Abstract

This paper discusses the problem of determining optimal designs for regression models,
when the observations are dependent and taken on an interval. A complete solution
of this challenging optimal design problem is given for a broad class of regression models
and covariance kernels.

We propose a class of estimators which are only slightly more complicated than the ordinary
least-squares estimators. We then demonstrate that we can design the experiments
such that asymptotically the new estimators achieve the same precision as the best linear
unbiased estimator computed for the whole trajectory of the process. As a by-product
we derive explicit expressions for the BLUE in the continuous time model and analytic
expressions for the optimal designs in a wide class of regression models. We also demonstrate
that for a finite number of observations the precision of the proposed procedure,
which includes the estimator and design, is very close to the best achievable. The results
are illustrated on a few numerical examples.
1 Introduction

Optimal design theory is a classical field of mathematical statistics with numerous applications
in life sciences, physics and engineering. In many cases the use of optimal or efficient designs
leads to a reduction of costs, because statistical inference can be carried out with a minimal
number of experiments without losing any accuracy. Most work on optimal design theory
concentrates on experiments with independent observations. Under this assumption the field is
very well developed and a powerful methodology for the construction of optimal designs has been
established [see for example the monograph of Pukelsheim (2006)]. While important and elegant
results have been derived in the case of independence, there exist numerous situations where
correlation between different observations is present and these classical optimal designs are
not applicable.

The theory of optimal design for correlated observations is much less developed and explicit
results are only available in rare circumstances. The main difficulty is that, in contrast to
the independent case, correlations lead to non-convex optimization problems, so that the
classical tools of convex optimization theory are not applicable. Some exact optimal designs
for specific linear models have been studied in Dette et al. (2008); Kiselak and Stehlík
[... Sacks and Ylvisaker (1966, 1968), Bickel and Herzberg (1979), Näther (1985a),
Zhigljavsky et al. (2010)], where the references differ in the asymptotic arguments used to
embed the discrete (non-convex) optimization problem in a continuous (or approximate) one.
However, in contrast to the uncorrelated case, this approach does not simplify the problem
substantially and, due to the lack of convexity, the resulting approximate optimal design
problems are still extremely difficult to solve. As a consequence, optimal designs have mainly
been determined analytically for the location model (in [...] models [see Boltze and Näther
(1982), Näther (1985a), Ch. 4, Näther (1985b), Pázman and Müller (2001) and Müller and Pázman
(2003) among others]. Only recently, Dette et al. (2013) determined (asymptotic) optimal
designs for least squares estimation in models with more parameters under the additional
assumption that the regression functions are eigenfunctions of an integral operator associated
with the covariance kernel of the error process. However, due to this assumption, the class of
models for which approximate optimal designs can be determined explicitly is rather small.

The present paper provides a complete solution of this challenging optimal design problem for 
a broad class of regression models and covariance kernels. Roughly speaking, we determine 
(asymptotic) optimal designs for a slightly modified ordinary least squares estimator (OLSE), 
such that the new estimate and the corresponding optimal design achieve the same accuracy 
as the best unbiased linear estimate (BLUE) with corresponding optimal designs. 

To be more precise, consider a general regression observation scheme given by

$$ y(t_j) = \theta^T f(t_j) + \varepsilon(t_j), \qquad j = 1, \ldots, N, \tag{1.1} $$

where $E[\varepsilon(t_j)] = 0$, $K(t_i, t_j) = E[\varepsilon(t_i)\varepsilon(t_j)]$ denotes the
covariance between observations at the points $t_i$ and $t_j$ ($i, j = 1, \ldots, N$),
$\theta = (\theta_1, \ldots, \theta_m)^T$ is a vector of unknown parameters,
$f(t) = (f_1(t), \ldots, f_m(t))^T$ is a vector of linearly independent functions, and the
explanatory variables $t_1, \ldots, t_N$ vary in a compact interval, say $[a, b]$. Parallel to
model (1.1) we also consider its continuous time version

$$ y(t) = \theta^T f(t) + \varepsilon(t), \qquad t \in [a, b], \tag{1.2} $$

where the full trajectory of the process $\{y(t) \mid t \in [a, b]\}$ can be observed and
$\{\varepsilon(t) \mid t \in [a, b]\}$ is a centered Gaussian process with covariance kernel $K$,
i.e. $K(s, t) = E[\varepsilon(s)\varepsilon(t)]$. This kernel is assumed to be continuous
throughout this paper.

We pay much attention to the one-parameter case and develop a general method for solving
the optimal design problem in model (1.2) explicitly for the OLSE, perhaps slightly modified.
The new estimate and the corresponding optimal design achieve the minimal variance among
all linear estimates (obtained by the BLUE). In particular, our approach allows us to calculate
this optimal variance explicitly. As a by-product we also identify the BLUE in the continuous
time model (1.2). Based on these asymptotic considerations, we consider the finite sample case
and suggest designs for a new estimation procedure (which is very similar to OLSE) with an
efficiency very close to the best possible (obtained by the BLUE and the corresponding optimal
design), for any number of observations. In doing this, we show how to implement the optimal
strategies from the continuous time model in practice and demonstrate that even for very small
sample sizes the loss of efficiency with respect to the best strategies based on the use of BLUE
with a corresponding optimal design can be considered as negligible. We would like to point out
that, even in the one-dimensional case, the problem of numerically calculating optimal designs
for the BLUE for a fixed sample size is an extremely challenging one due to the lack of
convexity of the optimization problem.

In our approach, the importance of the one-parameter design problem is also related to the
fact that the optimal design problem for multi-parameter models can be reduced componentwise
to problems in the one-parameter models. This gives us a way to generate analytically
constructed universally optimal designs for a wide range of continuous time multi-parameter
models of the form (1.2). Our technique is based on the observation that for a finite number of
observations we can always emulate the BLUE in model (1.1) by a different linear estimator.
To achieve that theoretically, in the one-parameter models we assign not only weights but also
signs to the support points of a discrete design, and in the multi-parameter case we use matrix
weights. We then determine "optimal" signs and weights and consider the weak convergence
of these "designs" and estimators as the sample size converges to infinity. Finally, we prove the
(universal) optimality of the limits in the continuous time model (1.2).

Theoretically, we construct a sequence of designs for either the pure or a modified OLSE, say
$\hat\theta_N$, such that its variance or covariance matrix satisfies
$\mathrm{Var}(\hat\theta_N) \to D^*$ as the sample size $N$ converges to infinity, where $D^*$
is the variance (if $m = 1$) or covariance matrix (if $m > 1$) for the BLUE in the continuous
time model (1.2). In other words, $D^*$ is the smallest possible variance (or covariance matrix
with respect to the Loewner ordering) of any unbiased linear estimator and any design. This
makes the designs derived in this paper very competitive in applications against the designs
proposed by Sacks and Ylvisaker (1966) and optimal designs constructed numerically for the BLUE
(using the Brimkulov-Krug-Savanov algorithm, for example). We emphasize once again that due to
non-convexity the numerical construction of optimal designs for the BLUE is extremely difficult.
An additional advantage of our approach is that we can analytically compute the BLUE with the
corresponding optimal variance (covariance matrix) $D^*$ in the continuous time model (1.2) and
therefore monitor the proximity of different approximations to the optimal variance $D^*$
obtained by the BLUE.

The methodology developed in this paper results in a non-standard estimation and optimal
design theory and consists in a delicate interplay between new linear estimators and designs in
the models (1.1) and (1.2). For this reason let us briefly introduce various estimators, which
we will often refer to in the following discussion. Consider the model (1.1) and suppose that
$N$ observations are taken at experimental conditions $t_1, \ldots, t_N$. For the corresponding
vector of observations $Y = (y(t_1), \ldots, y(t_N))^T$, a general weighted least squares
estimator (WLSE) of $\theta$ is defined by

$$ \text{WLSE}: \quad \hat\theta_{WLSE} = (X^T W X)^{-1} X^T W Y, \tag{1.3} $$

where $X = (f_i(t_j))_{j=1,\ldots,N}^{i=1,\ldots,m}$ is an $N \times m$ design matrix and $W$ is
some $N \times N$ matrix such that $(X^T W X)^{-1}$ exists. For any such $W$ the estimator (1.3)
is obviously unbiased. The covariance matrix of the estimator (1.3) is given by

$$ \mathrm{Var}(\hat\theta_{WLSE}) = (X^T W X)^{-1} X^T W \Sigma W^T X (X^T W^T X)^{-1}, \tag{1.4} $$

where $\Sigma = (K(t_i, t_j))_{i,j=1,\ldots,N}$ is an $N \times N$ matrix of
variances/covariances. For the standard WLSE the matrix $W$ is symmetric non-negative definite;
in this case $\hat\theta_{WLSE}$ minimizes the weighted sum of squares
$SS_W(\theta) = (Y - X\theta)^T W (Y - X\theta)$ with respect to $\theta$. Important
particular cases of estimators of the form (1.3) are the OLSE, the best unbiased linear estimate
(BLUE) and the signed least squares estimate (SLSE):

$$ \text{OLSE}: \quad \hat\theta_{OLSE} = (X^T X)^{-1} X^T Y, \tag{1.5} $$

$$ \text{BLUE}: \quad \hat\theta_{BLUE} = (X^T \Sigma^{-1} X)^{-1} X^T \Sigma^{-1} Y, \tag{1.6} $$

$$ \text{SLSE}: \quad \hat\theta_{SLSE} = (X^T S X)^{-1} X^T S Y. \tag{1.7} $$

Here $S$ is an $N \times N$ diagonal matrix with entries $+1$ and $-1$ on the diagonal; note
that if $S \neq I_N$ then the SLSE is not a standard WLSE. While the use of BLUE and OLSE is
standard, the SLSE is less common. It was introduced in Boltze and Näther (1982) and further
studied in Chapter 5.3 of Näther (1985a). In the context of the present paper, the SLSE will
turn out to be very useful for constructing optimal designs for OLSE and the BLUE in the model
(1.2) with one parameter, where the full trajectory can be observed. Another estimate of
$\theta$, which is not a special case of the WLSE, will be introduced in Section 3 and used in
the multi-parameter models.
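For a single parameter ($m = 1$) the matrix $X$ is just the vector of values $f(t_j)$ and the
covariance formula (1.4) collapses to a ratio of two quadratic forms, which makes the estimators
(1.5)-(1.7) easy to compare numerically. The following sketch is purely illustrative: the
Brownian-motion kernel $K(s,t) = \min(s,t)$, the three design points, the regression function
$f(t) = t^2 + 1$, the sign pattern and all function names are assumptions made for the example.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for k in range(col, n + 1):
                M[r][k] -= factor * M[col][k]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def inverse(A):
    """Invert A column by column using solve()."""
    n = len(A)
    cols = [solve(A, [1.0 if i == j else 0.0 for i in range(n)]) for j in range(n)]
    return [[cols[j][i] for j in range(n)] for i in range(n)]


def wlse_variance(f, W, sigma):
    """Formula (1.4) for m = 1: (f^T W Sigma W^T f) / (f^T W f)^2."""
    n = len(f)
    v = [sum(W[j][i] * f[j] for j in range(n)) for i in range(n)]  # v = W^T f
    a = sum(f[i] * v[i] for i in range(n))                         # f^T W f
    quad = sum(v[i] * sigma[i][j] * v[j] for i in range(n) for j in range(n))
    return quad / a ** 2


# Assumed toy setting: Brownian motion observed at three points of [1, 2].
t = [1.0, 1.5, 2.0]
f = [x * x + 1.0 for x in t]
sigma = [[min(a, b) for b in t] for a in t]
n = len(t)

identity = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
signs = [[0.0] * n for _ in range(n)]
for i, s in enumerate([1.0, 1.0, -1.0]):   # an arbitrary sign pattern S
    signs[i][i] = s

var_olse = wlse_variance(f, identity, sigma)        # W = I, estimator (1.5)
var_blue = wlse_variance(f, inverse(sigma), sigma)  # W = Sigma^{-1}, estimator (1.6)
var_slse = wlse_variance(f, signs, sigma)           # W = S, estimator (1.7)
```

With $W = \Sigma^{-1}$ the formula reduces to $1/(f^T \Sigma^{-1} f)$, so `var_blue` is a lower
bound for the other two values, whatever sign pattern is used.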

The remaining structure of the paper is as follows. In Section 2 we derive optimal designs for
continuous time one-parameter models and discuss how to implement the designs in practice.
In Section 3 we extend the results of Section 2 to multi-parameter models. In Appendix B we
discuss transformations of regression models and associated designs, which are a main tool in
the proofs of our results but are also of independent interest. In particular, we provide an
extension of the famous Doob representation for Gaussian processes [see Doob (1949) and Mehr
and McFadden (1965)], which turns out to be a very important ingredient in proving the design
optimality results of Sections 2 and 3. Finally, in Appendix A we collect some auxiliary
statements and proofs for the main results of this paper.


2 Optimal designs for one-parameter models 

In this section we concentrate on the one-parameter model

$$ y(t_j) = \theta f(t_j) + \varepsilon(t_j), \qquad j = 1, \ldots, N, \tag{2.1} $$

on the interval $[a, b]$ and its continuous time analogue, where $E[\varepsilon(t)] = 0$ and
$E[\varepsilon(t)\varepsilon(t')] = K(t, t')$. Our approach uses some non-standard ideas and
estimators in linear models and therefore we begin this section with a careful explanation of
the logic of the material.

Sect. 2.1 Under the assumption that the design space is finite we show in Lemma 2.1 that
by assigning weights and signs to the observation points $\{t_1, \ldots, t_N\}$ we can construct
a WLSE which is equivalent to the BLUE. Then, we derive in Corollary 2.1 an explicit form
for the optimal weights for a broad class of covariance kernels, which are called triangular
covariance kernels.

Sect. 2.2 We demonstrate in Theorem 2.1 that the optimal designs derived in Sect. 2.1
converge weakly to a signed measure, if the cardinality of the design space converges to
infinity.

Sect. 2.3 We consider model (2.1) under the assumption that the full trajectory of
the process $\{y(t) \mid t \in [a, b]\}$ can be observed. For the specific case of Brownian
motion, that is $K(t, t') = \min\{t, t'\}$, we prove analytically the optimality of the signed
measure derived in Theorem 2.1 for OLSE. Then, in Theorem 2.3 we establish optimality of the
asymptotic measures from Theorem 2.1 for general covariance kernels. As a by-product we
also identify the BLUE in the continuous time model (1.2) (in the one-dimensional case).
For this purpose, we introduce a transformation which maps any regression model with a
triangular covariance kernel into another model with a different triangular kernel. These
transformations allow us to reduce any optimization problem to the situation considered
in Theorem 2.2, which refers to the case of Brownian motion. The construction of this map
is based on an extension of the celebrated Doob representation, which will be developed
in Appendix B.

Sect. 2.4 We provide some examples of asymptotic optimal measures for specific models.

Sect. 2.5 We introduce a practical implementation of the asymptotic theory derived in
the previous sections. For a finite sample size we construct WLSE with corresponding
designs which can achieve very high efficiency compared to the BLUE with corresponding
optimal design. It turns out that these estimators are slightly modified OLSE, where only
observations at the end-points obtain a weight (and in some cases also a sign).

Sect. 2.6 We illustrate the new methodology in several examples. In particular, we give
a comparison with the best known procedures based on BLUE and show that the loss in
precision for the procedures derived in this paper is negligible, while our procedures are
much simpler and more robust than the procedures based on BLUE.


2.1 Optimal designs for SLSE on a finite design space 

In this section, we suppose that the design space for model (2.1) is finite, say
$T = \{t_1, \ldots, t_N\}$, and demonstrate that in this case the approximate optimal designs
for the SLSE (1.7) can be found explicitly. Since we consider the SLSE (1.7) rather than the
OLSE (1.5), a generic approximate design on the design space $T = \{t_1, \ldots, t_N\}$ is an
arbitrary discrete signed measure $\xi = \{t_1, \ldots, t_N; w_1, \ldots, w_N\}$, where
$w_i = s_i p_i$, $s_i \in \{-1, 1\}$, $p_i \ge 0$ ($i = 1, \ldots, N$) and
$\sum_{i=1}^N p_i = 1$. We assume that the support $t_1, \ldots, t_N$ of the design is fixed
but the weights $p_1, \ldots, p_N$ and signs $s_1, \ldots, s_N$, or equivalently the signed
weights $w_i$, will be chosen to minimize the variance of the SLSE (1.7). In view of (1.4),
this variance is given by

$$ D(\xi) = \sum_{i=1}^N \sum_{j=1}^N K(t_i, t_j)\, w_i w_j f(t_i) f(t_j) \Big/ \Big( \sum_{i=1}^N w_i f^2(t_i) \Big)^2 . \tag{2.2} $$

Note that this expression coincides with the variance of the WLSE (1.3), where the matrix $W$
is defined by $W = \mathrm{diag}(w_1, \ldots, w_N)$.

We assume that $f(t_i) \neq 0$ for all $i = 1, \ldots, N$. If $f(t_j) = 0$ for some $j$ then the
point $t_j$ can be removed from the design space $T$ without changing the SLSE, its variance and
the corresponding value $D(\xi)$. In the above definition of the weights $w_i$, we have
$\sum_{i=1}^N |w_i| = \sum_{i=1}^N p_i = 1$. Note, however, that the value of the criterion (2.2)
does not change if we change all the weights from $w_i$ to $c w_i$ ($i = 1, \ldots, N$) for
arbitrary $c \neq 0$.

Despite the fact that the functional $D$ in (2.2) is not convex as a function of
$(w_1, \ldots, w_N)$, the problem of determining the optimal design can be easily solved by a
simple application of the Cauchy-Schwarz inequality. The proof of the following lemma is given
in Appendix A [see also Theorem 5.3 in Näther (1985a), where this result was proved in a
slightly different form].


Lemma 2.1 Assume that the matrix $\Sigma = (K(t_i, t_j))_{i,j=1,\ldots,N}$ is positive definite
and $f(t_i) \neq 0$ for all $i = 1, \ldots, N$. Then the optimal weights
$w_1^*, \ldots, w_N^*$ minimizing (2.2) subject to the constraint $\sum_{i=1}^N |w_i| = 1$ are
given by

$$ w_i^* = c\, \frac{e_i^T \Sigma^{-1} \mathbf{f}}{f(t_i)}, \qquad i = 1, \ldots, N, \tag{2.3} $$

where $\mathbf{f} = (f(t_1), \ldots, f(t_N))^T$, $e_i = (0, \ldots, 0, 1, 0, \ldots, 0)^T \in \mathbb{R}^N$
is the $i$-th unit vector, and

$$ c = \Big( \sum_{i=1}^N \big| e_i^T \Sigma^{-1} \mathbf{f} / f(t_i) \big| \Big)^{-1} . $$

Moreover, for the design $\xi^* = \{t_1, \ldots, t_N; w_1^*, \ldots, w_N^*\}$ with weights (2.3)
we have $D(\xi^*) = D^*$, where $D^* = 1/(\mathbf{f}^T \Sigma^{-1} \mathbf{f})$ is the variance
of the BLUE defined in (1.6) using all observations $t_1, \ldots, t_N$.

Lemma 2.1 shows, in particular, that the pair {SLS estimate, corresponding optimal design
$\xi^*$} provides an unbiased estimator with the best possible variance for the one-parameter
model (2.1). This results in a WLSE (1.3) with $W^* = \mathrm{diag}(w_1^*, \ldots, w_N^*)$ which
is BLUE. In other words, by a slight modification of the OLSE we are able to emulate the BLUE
using the appropriate design or WLSE.

While the statement of Lemma 2.1 holds for arbitrary kernels, we are able to determine the
optimal weights $w_i^*$ more explicitly for a broad class of kernels, which are called
triangular kernels and are of the form

$$ K(t, t') = u(t)\, v(t') \quad \text{for } t \le t', \tag{2.4} $$

where $u(\cdot)$ and $v(\cdot)$ are some functions on the interval $[a, b]$. Note that the
majority of covariance kernels considered in the literature belong to this class; see for
example Näther (1985a), Zhigljavsky et al. (2010) or Harman and Štulajter (2011). The following
result is a direct consequence of Lemma A.1 from Appendix A.


Corollary 2.1 Assume that the covariance kernel $K(\cdot, \cdot)$ has the form (2.4), so that
the matrix $\Sigma = (K(t_i, t_j))_{i,j=1,\ldots,N}$ is positive definite and has the entries
$K(t_i, t_j) = u_i v_j$ for $i \le j$, where for $k = 1, \ldots, N$ we denote $u_k = u(t_k)$,
$v_k = v(t_k)$, and also $f_k = f(t_k)$, $q_k = u_k / v_k$. If $f_i \neq 0$
($i = 1, \ldots, N$), the weights in (2.3) can be represented explicitly as follows:

$$ w_1^* = \frac{c}{f_1}\,(\sigma_{11} f_1 + \sigma_{12} f_2)
 = \frac{c\, u_2}{f_1 v_1 v_2 (q_2 - q_1)} \Big( \frac{f_1}{u_1} - \frac{f_2}{u_2} \Big), \tag{2.5} $$

$$ w_N^* = \frac{c}{f_N}\,(\sigma_{N,N} f_N + \sigma_{N-1,N} f_{N-1})
 = \frac{c}{f_N v_N (q_N - q_{N-1})} \Big( \frac{f_N}{v_N} - \frac{f_{N-1}}{v_{N-1}} \Big), \tag{2.6} $$

$$ w_i^* = \frac{c}{f_i}\,(\sigma_{i,i} f_i + \sigma_{i-1,i} f_{i-1} + \sigma_{i,i+1} f_{i+1})
 = \frac{c}{f_i} \Big( \frac{(q_{i+1} - q_{i-1}) f_i}{v_i^2 (q_{i+1} - q_i)(q_i - q_{i-1})}
 - \frac{f_{i-1}}{v_i v_{i-1} (q_i - q_{i-1})}
 - \frac{f_{i+1}}{v_i v_{i+1} (q_{i+1} - q_i)} \Big), \tag{2.7} $$

for $i = 2, \ldots, N - 1$. In formulas (2.5), (2.6) and (2.7), the quantity $\sigma_{ij}$
denotes the element in the position $(i, j)$ of the matrix
$\Sigma^{-1} = (\sigma_{ij})_{i,j=1,\ldots,N}$.
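The closed-form weights of Corollary 2.1 can be compared with a direct numerical evaluation of
$\Sigma^{-1}\mathbf{f}$ from (2.3). The sketch below does this with $c = 1$ for one assumed
triangular kernel, $K(t, t') = e^{-|t - t'|}$ with $u(t) = e^{t}$ and $v(t) = e^{-t}$, the
assumed regression function $f(t) = t$ and an assumed six-point grid; all names are illustrative.

```python
import math


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for k in range(col, n + 1):
                M[r][k] -= factor * M[col][k]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def closed_form_weights(t, u, v, f):
    """Unnormalized weights (c = 1) from formulas (2.5)-(2.7)."""
    n = len(t)
    U = [u(x) for x in t]
    V = [v(x) for x in t]
    F = [f(x) for x in t]
    Q = [U[i] / V[i] for i in range(n)]
    w = [0.0] * n
    # (2.5): left end-point
    w[0] = U[1] / (F[0] * V[0] * V[1] * (Q[1] - Q[0])) * (F[0] / U[0] - F[1] / U[1])
    # (2.6): right end-point
    w[-1] = (F[-1] / V[-1] - F[-2] / V[-2]) / (F[-1] * V[-1] * (Q[-1] - Q[-2]))
    # (2.7): interior points
    for i in range(1, n - 1):
        w[i] = ((Q[i + 1] - Q[i - 1]) * F[i]
                / (V[i] ** 2 * (Q[i + 1] - Q[i]) * (Q[i] - Q[i - 1]))
                - F[i - 1] / (V[i] * V[i - 1] * (Q[i] - Q[i - 1]))
                - F[i + 1] / (V[i] * V[i + 1] * (Q[i + 1] - Q[i]))) / F[i]
    return w


# Assumed example: exponential kernel with lambda = 1 on a grid in [1, 2].
t = [1.0 + 0.2 * k for k in range(6)]
u = lambda x: math.exp(x)
v = lambda x: math.exp(-x)
f = lambda x: x

sigma = [[u(min(a, b)) * v(max(a, b)) for b in t] for a in t]  # K = u(min) v(max)
z = solve(sigma, [f(x) for x in t])                            # Sigma^{-1} f
w_direct = [z[i] / f(t[i]) for i in range(len(t))]
w_closed = closed_form_weights(t, u, v, f)
max_diff = max(abs(a - b) for a, b in zip(w_direct, w_closed))
```

The practical point of the closed form is that no matrix inversion is needed: each weight depends
only on the values of $u$, $v$ and $f$ at the point itself and at its immediate neighbours.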


2.2 Weak convergence of designs 


In this section, we consider the asymptotic properties of designs with weights (2.5) - (2.7).
Recall that the design space is an interval, say $[a, b]$, and that we assume a triangular
covariance function of the form (2.4). According to the discussion of triangular covariance
kernels provided in Section 4.1 of Appendix B, the functions $u(\cdot)$ and $v(\cdot)$ are
continuous and strictly positive on the interval $(a, b)$ and the function
$q(\cdot) = u(\cdot)/v(\cdot)$ is positive, continuous and strictly increasing on $(a, b)$. We
also assume that the regression function $f$ in (2.1) is continuous and strictly positive on the
interval $(a, b)$. We define the transformation

$$ Q(t) = \frac{q(t) - q(a)}{q(b) - q(a)} \tag{2.8} $$

and note that the function $Q : [a, b] \to [0, 1]$ is increasing on the interval $[a, b]$ with
$Q(a) = 0$ and $Q(b) = 1$, that is, $Q(\cdot)$ is a cumulative distribution function (c.d.f.).
For fixed $N$ and $i = 1, \ldots, N$, define $z_{i,N} = (i - 1)/N$ and the design points

$$ t_{i,N} = Q^{-1}(z_{i,N}), \qquad i = 1, \ldots, N. \tag{2.9} $$


Theorem 2.1 Consider the optimal design problem for the model (2.1), where the error process
$\varepsilon(t)$ has the covariance kernel $K(t, s)$ of the form (2.4). Assume that $u(\cdot)$,
$v(\cdot)$, $f(\cdot)$ and $q(\cdot)$ are strictly positive, twice continuously differentiable
functions on the interval $[a, b]$. Consider the sequence of signed measures

$$ \xi_N = \{t_{1,N}, \ldots, t_{N,N};\ w_{1,N}, \ldots, w_{N,N}\}, $$

where the support points $t_{i,N}$ are defined in (2.9) and the weights $w_{i,N}$ are assigned
to these points according to the rule (2.3) of Lemma 2.1. Then the sequence of measures
$\{\xi_N\}_{N \in \mathbb{N}}$ converges in distribution to a signed measure $\xi^*$, which has
masses

$$ P_a = \frac{c}{f(a)\, v^2(a)\, q'(a)} \Big[ \frac{f(a)\, u'(a)}{u(a)} - f'(a) \Big], \qquad
 P_b = c \cdot \frac{h'(b)}{f(b)\, v(b)\, q'(b)} \tag{2.10} $$

at the points $a$ and $b$, respectively, and the signed density

$$ p(t) = -\frac{c}{f(t)\, v(t)} \Big[ \frac{h'(t)}{q'(t)} \Big]' \tag{2.11} $$

(that is, the Radon-Nikodym derivative of $\xi^*$ with respect to the Lebesgue measure) on the
interval $(a, b)$, where the function $h(\cdot)$ is defined by $h(t) = f(t)/v(t)$.


The proof of Theorem 2.1 is technically complicated and therefore given in Appendix A. The
constant $c$ in (2.10) and (2.11) is arbitrary. If a normalization $|\xi^*|([a, b]) = 1$ is
required, then $c$ can be found from the normalizing condition

$$ |P_a| + |P_b| + \int_a^b |p(t)|\, dt = 1 . $$

Throughout this paper we write the limiting designs of Theorem 2.1 in the form

$$ \xi^*(dt) = P_a \delta_a(dt) + P_b \delta_b(dt) + p(t)\, dt, \tag{2.12} $$

where $\delta_a(dt)$ and $\delta_b(dt)$ are the Dirac measures concentrated at the points $a$
and $b$, respectively, and the function $p(\cdot)$ is defined by (2.11). Note also that under
the assumptions of Theorem 2.1 the function $p(\cdot)$ is continuous on the interval $[a, b]$.
In the case of Brownian motion, the limiting design of Theorem 2.1 is particularly simple.


Example 2.1 If the error process $\varepsilon$ in model (2.1) is the Brownian motion on the
interval $[a, b]$ with $0 < a < b < \infty$, then $K(t, s) = \min(t, s)$ and hence $u(t) = t$,
$v(t) = 1$, $q(t) = t$. This implies that the limiting design of Theorem 2.1 is given by (2.12)
with

$$ P_a = c\, \frac{f(a) - f'(a)\, a}{a\, f(a)}, \qquad
 P_b = c\, \frac{f'(b)}{f(b)} \qquad \text{and} \qquad
 p(t) = -c\, \frac{f''(t)}{f(t)} . \tag{2.13} $$
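To make Example 2.1 concrete, the sketch below evaluates the two masses and the density for a
user-supplied regression function under Brownian-motion errors, and then rescales them so that
the total absolute mass $|P_a| + |P_b| + \int_a^b |p(t)|\,dt$ equals one. The function name, the
use of Simpson's rule and the test case $f(t) = t^2 + 1$ on $[1, 2]$ are assumptions made for
illustration; the sign of $c$ is arbitrary, and the code keeps $c > 0$.

```python
def brownian_limit_design(f, df, d2f, a, b, n=1000):
    """Masses and density of the limiting design for Brownian motion, normalized
    to total absolute mass one. f, df, d2f are f and its first two derivatives;
    n must be even (composite Simpson rule)."""
    pa = (f(a) - df(a) * a) / (a * f(a))      # P_a with c = 1
    pb = df(b) / f(b)                         # P_b with c = 1
    dens = lambda t: -d2f(t) / f(t)           # p(t) with c = 1

    # Simpson's rule for the integral of |p(t)| over [a, b].
    h = (b - a) / n
    s = abs(dens(a)) + abs(dens(b))
    for k in range(1, n):
        s += (4 if k % 2 else 2) * abs(dens(a + k * h))
    mass = s * h / 3.0

    c = 1.0 / (abs(pa) + abs(pb) + mass)      # normalizing constant
    return c * pa, c * pb, (lambda t: c * dens(t)), c * mass


# Assumed test case: f(t) = t^2 + 1 on [1, 2].
f = lambda t: t * t + 1.0
df = lambda t: 2.0 * t
d2f = lambda t: 2.0

P_a, P_b, density, P = brownian_limit_design(f, df, d2f, 1.0, 2.0)
```

For this test case the mass at $a$ vanishes, and the mass at $b$ and the density have opposite
signs, so that the limiting design is a genuinely signed measure.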


2.3 Optimal designs and the BLUE 

In this section we consider the continuous time model (1.2) in the case $m = 1$ and demonstrate
that the limiting designs derived in Theorem 2.1 are in fact optimal. A linear estimator for the
parameter $\theta$ in model (1.2) is defined by $\hat\theta_\mu = \int_a^b y(t)\, \mu(dt)$,
where $\mu$ is a signed measure on the interval $[a, b]$. Special cases include the OLSE and
SLSE

$$ \hat\theta_\xi = \int_a^b y(t) f(t)\, \xi(dt) \Big/ \int_a^b f^2(t)\, \xi(dt), $$

where $\xi$ is a measure or a signed measure on the interval $[a, b]$, respectively. Note that
$\hat\theta_\mu$ is unbiased if and only if $\int_a^b f(t)\, \mu(dt) = 1$ and that
$\hat\theta_\xi$ is unbiased by construction. The BLUE (in the continuous time model (1.2))
minimizes

$$ \Phi(\mu) = \mathrm{Var}(\hat\theta_\mu) = \int_a^b \int_a^b K(x, y)\, \mu(dx)\, \mu(dy) $$

in the class of all signed measures $\mu$ satisfying $\int_a^b f(t)\, \mu(dt) = 1$, and

$$ D^* = \inf\Big\{ \Phi(\mu) \;\Big|\; \mu \text{ signed measure on } [a, b],\ \int_a^b f(t)\, \mu(dt) = 1 \Big\} \tag{2.14} $$

denotes the best possible variance of all linear unbiased estimators in the continuous time
model (1.2).

Similarly, a signed measure $\xi^*$ on the interval $[a, b]$ is called optimal for least squares
estimation in the one-parameter model (1.2), if it minimizes the functional

$$ D(\xi) = \mathrm{Var}(\hat\theta_\xi)
 = \frac{\int_a^b \int_a^b K(t, s) f(t) f(s)\, \xi(dt)\, \xi(ds)}{\big( \int_a^b f^2(t)\, \xi(dt) \big)^2} \tag{2.15} $$

in the set of all signed measures $\xi$ on the interval $[a, b]$ such that
$\int_a^b f^2(t)\, \xi(dt) \neq 0$. In the case of a Brownian motion, we are able to establish
the optimality of the design of Example 2.1. A proof of the following result is given in
Appendix A.


Theorem 2.2 Let $\{\varepsilon(t) \mid t \in [a, b]\}$ be a Brownian motion, so that
$K(t, t') = \min\{t, t'\}$, and let $f$ be a positive, twice continuously differentiable function
on the interval $[a, b] \subset \mathbb{R}_+$. Then the signed measure $\xi^*$, defined by (2.12)
and (2.13) with arbitrary $c \neq 0$, minimizes the functional (2.15). The minimal value in
(2.15) is obtained as

$$ D(\xi^*) = \min_\xi D(\xi) = \Big[ \frac{f^2(a)}{a} + \int_a^b \big( f'(t) \big)^2 dt \Big]^{-1} . $$

Moreover, the BLUE in model (1.2) is given by $\hat\theta_{\mu^*}$, where
$\mu^*(dt) = f(t)\, \xi^{**}(dt)$ and $\xi^{**}$ is the signed measure defined by (2.12) and
(2.13) with constant $c^* = D(\xi^*)$. This further implies
$D^* = D(\xi^*) = \Phi(\mu^*)$.
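The closed-form minimal variance of Theorem 2.2 can be compared with the variance
$1/(\mathbf{f}^T \Sigma^{-1} \mathbf{f})$ of the discrete BLUE on a fine grid, which should
approach it from above as the grid is refined. The sketch below does this for the assumed test
case $f(t) = t^2 + 1$ on $[1, 2]$ with an assumed equidistant grid of 60 points; for this $f$,
$f^2(a)/a = 4$ and $\int_1^2 (2t)^2\,dt = 28/3$, so the limiting variance is $3/40$.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for k in range(col, n + 1):
                M[r][k] -= factor * M[col][k]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def blue_variance(points, f):
    """Variance 1 / (f^T Sigma^{-1} f) of the discrete BLUE for Brownian motion."""
    fv = [f(t) for t in points]
    sigma = [[min(s, t) for t in points] for s in points]
    z = solve(sigma, fv)                      # Sigma^{-1} f
    return 1.0 / sum(x * y for x, y in zip(fv, z))


a, b, n_points = 1.0, 2.0, 60
grid = [a + (b - a) * k / (n_points - 1) for k in range(n_points)]

var_grid = blue_variance(grid, lambda t: t * t + 1.0)
d_star = 1.0 / (4.0 + 28.0 / 3.0)   # closed form for f(t) = t^2 + 1 on [1, 2]
gap = var_grid - d_star             # non-negative, shrinks as the grid is refined
```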


Based on the design optimality established in Theorem 2.2 for the special case of Brownian
motion and the technique of transformation of regression models described in Appendix B, we
can establish the optimality of the asymptotic designs derived in Theorem 2.1 for more general
covariance kernels; see Appendix A for the proof.


Theorem 2.3 Under the conditions of Theorem 2.1 the optimal design minimizing the
functional (2.15) is defined by the formulas (2.10) - (2.12) with arbitrary $c \neq 0$. The
minimal value in (2.15) is obtained as

$$ D(\xi^*) = \Big[ \frac{\tilde f^2(q(a))}{q(a)} + \int_{q(a)}^{q(b)} \big( \tilde f'(s) \big)^2 ds \Big]^{-1}, \tag{2.16} $$

where $\tilde f(s) = f(q^{-1}(s)) / v(q^{-1}(s))$. Moreover, the BLUE in model (1.2) is given by
$\hat\theta_{\mu^*}$, where $\mu^*(dt) = f(t)\, \xi^{**}(dt)$, $\xi^{**}$ is the signed measure
defined in (2.10) - (2.12) with constant $c^* = D(\xi^*)$, and
$D^* = \Phi(\mu^*) = D(\xi^*)$.
2.4 Examples of optimal designs


In this section, we provide the values of $P_a$, $P_b$ and the function $p(\cdot)$ in the
general expression (2.12) for the optimal designs in a number of important special cases for the
one-parameter continuous time model (1.2), where the design space is $T = [a, b]$. Specifically,
optimal designs are given in Table 1 for the location model, in Table 2 for the linear model, in
Table 3 for a quadratic model and in Table 4 for a trigonometric model. The last named model was
especially chosen to demonstrate the existence of optimal designs with a density $p$ which
changes sign in the interval $(a, b)$. In the tables several triangular covariance kernels are
considered. The parameters of these covariance kernels satisfy the constraints
$c_2 > \pm c_1$, $\mp c_2 \notin [a, b]$, $\gamma > \omega$, $\lambda > 0$. For the sake of a
transparent presentation, we use the factor $c = 1$ in all tables, but we emphasize once again
that the optimal designs do not depend on the scaling factor.

As an example, if $K(t, t') = e^{-\lambda |t - t'|}$ for some $\lambda > 0$, we have from the
last row of Table 2 that the optimal design for the continuous time model
$\{\theta t + \varepsilon(t) \mid t \in [1, 2]\}$ is
$\xi^*(dt) = (\lambda - 1)\delta_1(dt) + (\lambda + \tfrac12)\delta_2(dt) + \lambda^2 dt$, and
as a consequence, $D^* = \big( \tfrac52 + \tfrac76 \lambda + \tfrac{1}{2\lambda} \big)^{-1}$.
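The value of $D^*$ in this example can be cross-checked against formula (2.16). The following
short derivation is a sketch under the assumption that the exponential kernel is factorized as
$u(t) = e^{\lambda t}$, $v(t) = e^{-\lambda t}$; it uses the substitution $s = q(t)$, under which
$\tilde f(q(t)) = h(t)$ with $h = f/v$ and the integral in (2.16) becomes
$\int_a^b (h'(t))^2 / q'(t)\, dt$.

```latex
\begin{align*}
q(t) &= \frac{u(t)}{v(t)} = e^{2\lambda t}, \qquad
h(t) = \frac{f(t)}{v(t)} = t\, e^{\lambda t}, \qquad
h'(t) = (1 + \lambda t)\, e^{\lambda t}, \\
\frac{1}{D^*} &= \frac{h^2(a)}{q(a)} + \int_a^b \frac{(h'(t))^2}{q'(t)}\, dt
 = 1 + \frac{1}{2\lambda} \int_1^2 (1 + \lambda t)^2\, dt \\
 &= 1 + \frac{(1 + 2\lambda)^3 - (1 + \lambda)^3}{6 \lambda^2}
 = \frac{5}{2} + \frac{7\lambda}{6} + \frac{1}{2\lambda}.
\end{align*}
```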

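Table 1: Optimal designs for the location model: $f(t) = 1$, $t \in [a, b]$.

$$
\begin{array}{ccccc}
u(t) & v(t) & P_a & P_b & p(t) \\ \hline
\text{any} & 1 & 1 & 0 & 0 \\
c_1 + t & c_2 \pm t & \dfrac{1}{a + c_1} & -\dfrac{1}{b \pm c_2} & 0 \\
t^{\gamma} & t^{\omega} & -\gamma a^{-\gamma-\omega} & \omega b^{-\gamma-\omega} & \gamma \omega t^{-1-\gamma-\omega} \\
e^{\lambda t} & e^{-\gamma t} & \lambda e^{a(\gamma-\lambda)} & \gamma e^{b(\gamma-\lambda)} & \lambda \gamma e^{t(\gamma-\lambda)}
\end{array}
$$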


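Table 2: Optimal designs for the linear regression model through the origin: $f(t) = t$, $t \in [a, b]$.

$$
\begin{array}{ccccc}
u(t) & v(t) & P_a & P_b & p(t) \\ \hline
t & 1 & 0 & 1 & 0 \\
c_1 + t & c_2 \pm t & \dfrac{-c_1}{(a + c_1)\,a} & \dfrac{\pm c_2}{(b \pm c_2)\,b} & 0 \\
t^{\gamma} & t^{\omega} & -(\gamma-1) a^{-\gamma-\omega} & (\omega-1) b^{-\gamma-\omega} & (1-\gamma)(1-\omega) t^{-1-\gamma-\omega} \\
e^{\lambda t} & 1 & \dfrac{(a\lambda - 1) e^{-a\lambda}}{a} & \dfrac{e^{-b\lambda}}{b} & \dfrac{\lambda e^{-t\lambda}}{t} \\
e^{\lambda t} & e^{-\gamma t} & \dfrac{(a\lambda - 1) e^{a(\gamma-\lambda)}}{a} & \dfrac{(b\gamma + 1) e^{b(\gamma-\lambda)}}{b} & \dfrac{(\lambda\gamma t - \gamma + \lambda) e^{t(\gamma-\lambda)}}{t}
\end{array}
$$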


Table 3: Optimal designs for the quadratic regression model: $f(t) = t^2 + \kappa$, $t \in [a, b]$.

$$
\begin{array}{ccccc}
u(t) & v(t) & P_a f(a) & P_b f(b) & p(t) f(t) \\ \hline
t & 1 & (a^2 - \kappa)/a & -2b & 2 \\
c_1 + t & c_2 \pm t & \dfrac{a^2 - \kappa + 2 a c_1}{a + c_1} & \mp\dfrac{b^2 - \kappa \pm 2 b c_2}{c_2 \pm b} & 2 \\
t^{\gamma} & t^{\omega} & ((2-\gamma) a^2 - \gamma\kappa)\, a^{-\gamma-\omega} & ((\omega-2) b^2 + \omega\kappa)\, b^{-\gamma-\omega} & ((2-\omega)(2-\gamma) t^2 + \kappa\gamma\omega)\, t^{-1-\gamma-\omega} \\
e^{\lambda t} & 1 & (2a - (a^2 + \kappa)\lambda)\, e^{-a\lambda} & -2b\, e^{-b\lambda} & 2(1 - t\lambda)\, e^{-t\lambda} \\
e^{\lambda t} & e^{-\lambda t} & 2a - (a^2 + \kappa)\lambda & -((b^2 + \kappa)\lambda + 2b) & 2 - \lambda^2 (t^2 + \kappa)
\end{array}
$$

Table 4: Optimal designs for the trigonometric regression model: $f(t) = 1 + \tfrac12 \sin(2\pi t)$, $t \in [1, 2]$.

$$
\begin{array}{ccccc}
u(t) & v(t) & P_a & P_b & p(t) f(t) \\ \hline
t & 1 & 1 - \pi & \pi & 2\pi^2 \sin(2\pi t) \\
c_1 + t & c_2 \pm t & \dfrac{1 - \pi c_1 - \pi}{c_1 + 1} & \dfrac{\pi c_2 \pm 2\pi \mp 1}{c_2 \pm 2} & 2\pi^2 \sin(2\pi t) \\
t^2 & t & 2 - \pi & (2\pi - 1)/8 & 2 t^{-4} \big( (\pi^2 t^2 - \tfrac12) \sin(2\pi t) + \pi t \cos(2\pi t) - 1 \big) \\
e^{\lambda t} & 1 & (\lambda - \pi)\, e^{-\lambda} & \pi e^{-2\lambda} & (2\pi^2 \sin(2\pi t) + \pi\lambda \cos(2\pi t))\, e^{-\lambda t} \\
e^{\lambda t} & e^{-\lambda t} & \lambda - \pi & \lambda + \pi & (2\pi^2 + \lambda^2/2) \sin(2\pi t) + \lambda^2
\end{array}
$$

2.5 Practical implementation: designs for finite sample size

In practice, efficient designs and corresponding estimators for the model (1.1) have to be derived
from the optimal solutions in the continuous time model (1.2), and in this section a procedure
with a good finite sample performance is proposed. Roughly speaking, it consists of a slight
modification of the ordinary least squares estimator and a discretization of a continuous signed
measure with the asymptotic optimal density in (2.11).

We assume that the experimenter can take $N + 2$ observations with $N$ observations inside the
interval $[a, b]$. In principle, any probability measure on the interval can be approximated by
an $(N + 2)$-point measure with weights $1/(N + 2)$ and similarly any finite signed measure can
be approximated by an $(N + 2)$-point signed measure with equal weights (in absolute value).
We hence could use a direct approximation of the optimal signed measures of the form (2.12)
by a sequence of $(N + 2)$-point signed measures with equal weights (in absolute value). For
an increasing sample size this sequence will eventually converge to the optimal measure of
Theorem 2.3. However, this convergence will typically be very slow, where we measure the speed
of convergence by the differences between the variances $D(\xi)$ of the corresponding estimates
and the optimal value $D^*$ defined in (2.16). The main difficulty lies in the fact that a typical
optimal measure has masses at the boundary points $a$ and $b$, in addition to some density on
the interval $(a, b)$. The convergence of discrete measures with equal (in absolute value) weights
to such a measure will be very slow, especially in view of the fact that in our approximating
measures the points cannot be repeated. Summarizing, approximation of the optimal signed
measures by measures with equal weights is possible but cannot be accurate for small $N$.

In order to improve the rate of convergence we propose a slight modification of the ordinary
least squares procedure. In particular, we propose a WLSE with weights at the points $a$
and $b$ (the end-points of the interval $[a, b]$), which correspond to the masses $P_a$ and $P_b$
of the asymptotic optimal design. We thus only need to approximate the continuous part of the
optimal signed measure, which has a density on $(a, b)$, by an $N$-point design with equal masses.
To be precise, consider an optimal measure of the form (2.12). We assume that the density
$p(\cdot)$ is not identically zero on the interval $(a, b)$ and choose the constant $c$ such that
$\int_a^b |p(t)|\, dt = 1$. Note that unless $p(\cdot)$ changes sign in $(a, b)$, we can choose
$p(t) \ge 0$ for all $t \in (a, b)$. Define $\varphi(t) = |p(t)|$ for $t \in (a, b)$ and denote by
$F(t) = \int_a^t \varphi(s)\, ds$ the corresponding distribution function. The $N$-point design
we use as an $N$-point approximation to the measure with density $\varphi(t)$ is
$\xi_N = \{t_{1,N}, \ldots, t_{N,N};\ 1/N, \ldots, 1/N\}$, where $t_{i,N} = F^{-1}(z_{i,N})$ with
$z_{i,N} = i/(N + 1)$, $i = 1, 2, \ldots, N$. If $p(t) = 0$ on a sub-interval of $[a, b]$ and
$F^{-1}(z_{i,N})$ is not uniquely defined, then we choose the smallest element from the set
$F^{-1}(z_{i,N})$ as $t_{i,N}$. Finally, the design we suggest as an $(N + 2)$-point
approximation to the optimal measure in (2.12) is

$$ \xi_{N+2} = P_a \delta_a + P_b \delta_b + P\, \tilde\xi_N, $$

where $P = 1 - |P_a| - |P_b|$,
$\tilde\xi_N = \{t_{1,N}, \ldots, t_{N,N};\ s_{1,N}/N, \ldots, s_{N,N}/N\}$ and
$s_{i,N} = \mathrm{sign}(p(t_{i,N}))$, $i = 1, \ldots, N$.

The matrix $W_N$, which corresponds to the design $\xi_{N+2}$ and is used in the corresponding WLSE
(1.3), is the diagonal matrix $W_N = \mathrm{diag}(N P_a, s_{1,N} P, s_{2,N} P, \ldots, s_{N,N} P, N P_b)$ of size
$(N+2) \times (N+2)$. The set of $N+2$ design points, where the observations should be taken, is given by
$\{a, t_{1,N}, \ldots, t_{N,N}, b\}$ and the resulting estimate is defined by

$$\hat\theta_{WLSE,N} = (X^T W_N X)^{-1} X^T W_N Y. \tag{2.17}$$

It follows from (1.4), (2.15) and the discussion of the previous paragraph that

$$\lim_{N \to \infty} \mathrm{Var}(\hat\theta_{WLSE,N}) = \lim_{N \to \infty} D(\xi_{N+2}) = D^*,$$

where $D^*$ is defined in (2.14).

2.6 Some numerical results 

Consider the regression model (2.1) with $f(t) = t^2 + 1$, $t \in [1, 2]$, where the error process
is given by the Brownian motion. The optimal design for this model can be obtained from
Table 3 and we have $P_a = 0$, $P_b = -0.55$, $P = 0.45$ and $p(t) = 1.38/(t^2 + 1)$. By computing
the quantiles from the c.d.f. corresponding to $p$ we can easily obtain the support points of the
$(N+2)$-point designs. For example, $\mathrm{supp}(\xi_4) = \{1, 1.24, 1.56, 2\}$, $\mathrm{supp}(\xi_5) = \{1, 1.18, 1.39, 1.65, 2\}$
and $\mathrm{supp}(\xi_6) = \{1, 1.14, 1.30, 1.49, 1.71, 2\}$.
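The quantile construction can be checked numerically. The following Python sketch is ours and is not part of the original computations. It assumes only the density proportional to $1/(t^2+1)$ on $[1, 2]$ from this example, for which the c.d.f. $F(t) = (\arctan t - \arctan 1)/(\arctan 2 - \arctan 1)$ can be inverted in closed form; the function name `support_points` is hypothetical.

```python
import math

def support_points(n, a=1.0, b=2.0):
    """Support of the (n+2)-point design for a density proportional to 1/(t^2+1).

    Interior points are the quantiles t_i = F^{-1}(i/(n+1)), i = 1..n, where
    F(t) = (atan(t) - atan(a)) / (atan(b) - atan(a)) is the c.d.f. on [a, b].
    """
    lo, hi = math.atan(a), math.atan(b)
    interior = [math.tan(lo + (hi - lo) * i / (n + 1)) for i in range(1, n + 1)]
    return [a] + interior + [b]

for n in (2, 3, 4):
    # Rounded to two decimals for comparison with the supports quoted in the text.
    print(n + 2, [round(t, 2) for t in support_points(n)])
```

Rounded to two decimals, the printed points should match the three supports listed above.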

In Figure 1 we display the variance of various linear unbiased estimators for different sample
sizes. We observe that the variance of the WLSE defined by (2.17) for the proposed $(N+2)$-point
design $\xi_{N+2}$ is slightly larger than the variance of the BLUE for the proposed $(N+2)$-point
design, which in turn is very close to the variance of the BLUE with the corresponding optimal
$(N+2)$-point design. The calculation of these optimal designs is complicated and has been
performed numerically by the Nelder-Mead algorithm in MATLAB. We also note that due to the
non-convexity of the optimization problem it is not clear that the algorithm finds the optimal
design. However, by Theorems 2.2 and 2.3 we determined the optimal value (2.14), which is
$D^* \approx 0.075004$. This means that for the proposed designs the WLSE has almost the same
precision as the BLUE.
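To make this comparison concrete, the following sketch (ours, not the MATLAB code behind Figure 1) evaluates the variance of the WLSE (2.17) on the proposed design and the variance of the BLUE on the same points. It assumes the Brownian motion kernel $K(t,s) = \min(t,s)$, $f(t) = t^2+1$ on $[1,2]$ and the rounded masses $P_a = 0$, $P_b = -0.55$, $P = 0.45$ quoted above, with all signs $s_{i,N} = +1$. The BLUE variance is computed from the independent increments of Brownian motion. All function names are hypothetical.

```python
import math

def f(t):
    return t * t + 1.0

def proposed_points(n, a=1.0, b=2.0):
    # Quantiles of the density proportional to 1/(t^2+1) on [a, b], plus the end-points.
    lo, hi = math.atan(a), math.atan(b)
    inner = [math.tan(lo + (hi - lo) * i / (n + 1)) for i in range(1, n + 1)]
    return [a] + inner + [b]

def wlse_variance(n, p_a=0.0, p_b=-0.55, p=0.45):
    """Variance of the WLSE with W = diag(N P_a, P, ..., P, N P_b) (one-parameter model)."""
    t = proposed_points(n)
    w = [n * p_a] + [p] * n + [n * p_b]
    g = [wi * f(ti) for wi, ti in zip(w, t)]           # entries of W X
    denom = sum(gi * f(ti) for gi, ti in zip(g, t))    # X^T W X, a scalar here
    numer = sum(gi * gj * min(ti, tj)                  # X^T W Sigma W X
                for gi, ti in zip(g, t) for gj, tj in zip(g, t))
    return numer / denom ** 2

def blue_variance(n):
    """Variance of the BLUE on the same points: y(t_0) and the increments are uncorrelated."""
    t = proposed_points(n)
    info = f(t[0]) ** 2 / t[0]
    info += sum((f(t[i + 1]) - f(t[i])) ** 2 / (t[i + 1] - t[i])
                for i in range(len(t) - 1))
    return 1.0 / info

for n in (2, 5, 10, 20):
    print(n, round(wlse_variance(n), 5), round(blue_variance(n), 5))
```

Under these assumptions both variances should lie slightly above 0.075, with the WLSE never below the BLUE, in line with the behaviour described for Figure 1.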



Figure 1: The variance of the WLSE defined in (2.17) for the proposed $(N+2)$-point designs
$\xi_{N+2}$ (crosses), of the BLUE for the proposed $(N+2)$-point designs (grey circles) and of the
BLUE with corresponding optimal $(N+2)$-point designs (line). The error process in model (2.1)
is given by the Brownian motion and the regression function is $f(t) = t^2 + 1$, $t \in [1, 2]$.


In our second example we compare the proposed optimal designs with the designs from Sacks and
Ylvisaker (1966), which are constructed for the BLUE. For this purpose we consider the model (2.1)
with regression function $f(t) = 1 + 0.5\sin(2\pi t)$, $t \in [1, 2]$, and triangular covariance kernel of the
form (2.4) with $u(t) = t^2$ and $v(t) = t$. The optimal design in the continuous time model can
be obtained from Table 0] and its density is depicted in Figure 2.



Figure 2: The density of the optimal design for the continuous time model (2.1) with regression
function $f(t) = 1 + 0.5\sin(2\pi t)$, $t \in [1, 2]$, and covariance kernel of the form (2.4) with $u(t) = t^2$
and $v(t) = t$.

By computing quantiles using this optimal design, we obtain that the 4-point design $\xi_4$ is
supported at the points 1, 1.27, 1.68 and 2. For this design, the variance of the BLUE is $\approx 0.6129$.
Using the optimal density from Sacks and Ylvisaker (1966), we obtain the 4-point design supported
at 1, 1.25, 1.63 and 2; for this design, the variance of the BLUE is $\approx 0.6200$. For $N = 2, 3, \ldots, 20$, the
variances of the BLUE for the proposed $(N+2)$-point designs, the $(N+2)$-point designs from
Sacks and Ylvisaker (1966) and the optimal $(N+2)$-point designs for the BLUE are depicted
in Figure 3. We observe that for $N = 2, 3, 4$ the new designs yield a smaller variance of the
BLUE, while for $N = 5$ the design of Sacks and Ylvisaker (1966) shows a better performance.
In all other cases the results for both designs are very similar. In particular, for $N \ge 6$ the
variances from the optimal $(N+2)$-point designs proposed in this paper and in the paper
of Sacks and Ylvisaker (1966) are only slightly worse than the variances of the BLUE with
corresponding best $(N+2)$-point designs (which are computed by direct optimization).



Figure 3: The variance of the BLUE for the proposed $(N+2)$-point designs (grey circles), the
$(N+2)$-point designs from Sacks and Ylvisaker (1966) (crosses) and the BLUE with corresponding
optimal $(N+2)$-point designs (line) for the model $f(t) = 1 + 0.5\sin(2\pi t)$, $t \in [1, 2]$, and the
covariance kernel with $u(t) = t^2$ and $v(t) = t$; $N = 2, \ldots, 20$.


3 Multi-parameter models 

In this section we discuss optimal design problems for models with more than one parameter.
The structure of this section is somewhat similar to the structure of Section 2. In Section 3.1
we introduce a new class of linear estimators of the parameters in model (1.1), which we call
matrix-weighted estimators (MWE), and show in Lemma 3.3 that for some special choices of
the matrix weights the MWE can always emulate the BLUE. In Section 3.2 matrix-weighted
designs associated with the MWE are defined. Then, for the case of triangular kernels, in
Corollary 3.1 we derive the asymptotic forms for the sequence of designs that are associated
with the version of the MWE which emulates the BLUE. In Section 3.3 we prove optimality of
the asymptotic matrix-weighted measure derived in Corollary 3.1 in the continuous time model
(1.2) (see Theorem 3.1), while some examples of asymptotically optimal measures are provided
in Section 3.4. Finally, the practical implementation of the asymptotic measures is discussed
in Section 3.5 and numerical examples are provided in Section 3.6.

The proofs of many statements in this section use the results of Section 2. This is possible
as there is a lot of freedom in choosing the form of the MWE to emulate the BLUE and we
choose a special form which could be considered as component-wise SLSE. Correspondingly,
the resulting matrix-weighted designs (including the asymptotic ones) become combinations of
designs for one-parameter models.

3.1 Matrix-weighted estimators and designs 

Consider the regression model (1.1) and assume that $N$ observations at points $t_j$ ($j = 1, \ldots, N$)
have been made. Let $\mathbf{O}_j$ be an $m \times m$ matrix associated with the observation point $t_j$; $j =
1, \ldots, N$. Recall the definition of the design matrix $X = (f_i(t_j))^{i=1,\ldots,m}_{j=1,\ldots,N}$ and the definition of
$Y = (y(t_1), \ldots, y(t_N))^T$. We introduce the $m \times N$ matrix $C = (\mathbf{O}_1 f(t_1), \ldots, \mathbf{O}_N f(t_N))$, whose
$j$-th column is $\mathbf{O}_j f(t_j)$. Assuming that the $m \times m$ matrix

$$M = CX = \sum_{j=1}^{N} \mathbf{O}_j f(t_j) f^T(t_j) \tag{3.1}$$

is non-singular we define the linear estimator

$$\hat\theta_{MWE} = (CX)^{-1} C Y \tag{3.2}$$

for the vector $\theta$ in model (1.1). We call this estimator the matrix-weighted estimator (MWE),
because each column of the matrix $X$ is multiplied by a matrix weight. It is easy to see that
for any $C$ the MWE $\hat\theta_{MWE}$ is unbiased and its covariance matrix is given by

$$\mathrm{Var}(\hat\theta_{MWE}) = M^{-1} C \Sigma C^T (M^{-1})^T, \tag{3.3}$$

where $\Sigma = (K(t_i, t_j))_{i,j=1}^{N}$ is the $N \times N$ matrix of covariances of the errors. Note that
the matrix $M$ defined in (3.1) generalizes the standard information matrix $X^T X$ and that
$M$ is not necessarily a symmetric matrix. The following result shows that different matrices
$\mathbf{O}_1, \ldots, \mathbf{O}_N$ may yield the same matrix-weighted estimator $\hat\theta_{MWE}$. Its proof is obvious and
therefore omitted.

Lemma 3.1 Consider the regression model (1.1) and assume that the matrix $M$ defined in
(3.1) is non-singular. Then the estimator $\hat\theta_{MWE}$ defined in (3.2) coincides with the estimator
$\hat\theta_{MWE,A} = (C_A X)^{-1} C_A Y$, where $C_A = AC$ and $A$ is an arbitrary non-singular $m \times m$ matrix.
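
For completeness, the omitted arguments amount to the following short computation. This is a sketch in our notation: $\varepsilon$ denotes the vector of errors at the design points, so that $Y = X\theta + \varepsilon$ with $\mathrm{E}\,\varepsilon = 0$ and $\mathrm{Var}(\varepsilon) = \Sigma$, the matrix of error covariances appearing in (3.3).

$$
\begin{aligned}
\hat\theta_{MWE,A} &= (ACX)^{-1} ACY = (CX)^{-1} A^{-1} A\, CY = \hat\theta_{MWE},\\
\mathrm{E}\,\hat\theta_{MWE} &= (CX)^{-1} C X \theta = \theta,\\
\mathrm{Var}(\hat\theta_{MWE}) &= (CX)^{-1} C\, \Sigma\, C^T \big((CX)^{-1}\big)^T = M^{-1} C \Sigma C^T (M^{-1})^T.
\end{aligned}
$$

The first line uses only that $A$ and $CX$ are non-singular; the last two lines give the unbiasedness and the covariance formula (3.3).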

The estimator $\hat\theta_{MWE,A}$ introduced in Lemma 3.1 is the MWE defined by the matrix weights
$A\mathbf{O}_1, \ldots, A\mathbf{O}_N$. Lemma 3.1 implies that $\hat\theta_{MWE}$ is exactly the same for any set of matrices
$\{A\mathbf{O}_1, \ldots, A\mathbf{O}_N\}$ as long as $A$ is non-singular. In the asymptotic considerations below it will be
convenient to interpret the combination of the set of experimental conditions $\{t_1, \ldots, t_N\}$ and
the set of corresponding matrices $\{\mathbf{O}_1, \ldots, \mathbf{O}_N\}$ in the MWE as an $N$-point matrix-weighted
design.


Definition 3.1 Any combination of $N$ points $\{t_1, \ldots, t_N\}$ and $m \times m$ matrices $\{\mathbf{O}_1, \ldots, \mathbf{O}_N\}$
will be called an $N$-point matrix-weighted design and denoted by

$$\xi_N = \Big\{t_1, \ldots, t_N;\ \tfrac{1}{N}\mathbf{O}_1, \ldots, \tfrac{1}{N}\mathbf{O}_N\Big\}. \tag{3.4}$$

The covariance matrix $D(\xi_N)$ of a matrix-weighted design is defined as the covariance matrix
$\mathrm{Var}(\hat\theta_{MWE})$ in (3.3) of the corresponding estimate $\hat\theta_{MWE}$.

The estimator $\hat\theta_{MWE}$ is not necessarily a least-squares type estimator; that is, it may not be
representable in the form (1.3) for some $N \times N$ weight matrix $W$ and hence there may be no
associated weighted sum of squares which is minimized by the MWE. However, for any given
$W$, we can always find matrices $\mathbf{O}_j$ such that

$$C = X^T W \tag{3.5}$$

and therefore achieve $\hat\theta_{MWE} = \hat\theta_{WLSE}$. The following result gives a constructive solution to the
matrix equation (3.5).

Lemma 3.2 Assume that $f_1(t) \neq 0$ for all $t \in [a, b]$. Define $\mathbf{O}_j = \omega_j e_1^T$ with

$$\omega_j = \frac{1}{f_1(t_j)} (X^T W)_j \in \mathbb{R}^m, \tag{3.6}$$

where $e_1 = (1, 0, \ldots, 0)^T \in \mathbb{R}^m$ is the first unit vector and $(X^T W)_j$ denotes the $j$-th column of
the $m \times N$ matrix $X^T W$. Then the corresponding matrix-weighted estimator satisfies $\hat\theta_{MWE} =
\hat\theta_{WLSE}$.

Proof. The matrix equation (3.5) can be written as $N$ vector equations

$$\mathbf{O}_j f(t_j) = (X^T W)_j; \quad j = 1, \ldots, N, \tag{3.7}$$

with respect to the matrices $\mathbf{O}_j$. Assume that $\mathbf{O}_j = \omega_j e_1^T$ for some $\omega_j \in \mathbb{R}^m$. Then

$$\mathbf{O}_j f(t_j) = \omega_j e_1^T f(t_j) = \omega_j f_1(t_j)$$

and equation (3.7) has the unique solutions (3.6). □
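
The construction in Lemma 3.2 is easy to check numerically. The sketch below is ours: the regression vector `f`, the design points and the weight matrix `W` are hypothetical choices made only for illustration. It builds the one-column weights (3.6) and confirms that the columns $\mathbf{O}_j f(t_j)$ reproduce the columns of $X^T W$, so that (3.5) holds.

```python
def f(t):
    # Hypothetical regression vector with m = 3 and f_1(t) = t, non-zero on [1, 2].
    return [t, 1.0, t * t]

def xt_w(points, W):
    """m x N matrix X^T W, where row i of X is f(t_i)^T."""
    m, n = len(f(points[0])), len(points)
    return [[sum(f(points[i])[k] * W[i][j] for i in range(n)) for j in range(n)]
            for k in range(m)]

def one_column_weights(points, W):
    """omega_j = (X^T W)_j / f_1(t_j), as in (3.6); O_j = omega_j e_1^T."""
    XtW = xt_w(points, W)
    m = len(XtW)
    return [[XtW[k][j] / f(t)[0] for k in range(m)] for j, t in enumerate(points)]

def c_matrix(points, omegas):
    """m x N matrix C whose j-th column is O_j f(t_j) = omega_j * f_1(t_j)."""
    m = len(omegas[0])
    return [[omegas[j][k] * f(t)[0] for j, t in enumerate(points)] for k in range(m)]

points = [1.0, 1.3, 1.7, 2.0]
# An arbitrary symmetric, non-diagonal weight matrix (illustration only).
W = [[2.0, 0.5, 0.0, 0.0],
     [0.5, 1.0, 0.2, 0.0],
     [0.0, 0.2, 1.5, 0.1],
     [0.0, 0.0, 0.1, 3.0]]
C = c_matrix(points, one_column_weights(points, W))
XtW = xt_w(points, W)
max_diff = max(abs(C[k][j] - XtW[k][j]) for k in range(3) for j in range(4))
print(max_diff)  # zero up to floating-point rounding
```

Since $C = X^T W$, the estimator $(CX)^{-1}CY$ is then literally the WLSE with weight matrix $W$.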

The form $\mathbf{O}_j = \omega_j e_1^T$ for the matrices $\mathbf{O}_j$ considered in Lemma 3.2 means that the matrix $\mathbf{O}_j$
has the vector $\omega_j$ as its first column while all other entries in this matrix are zero. We shall
refer to this form as the one-column form. We can choose other forms for the matrices $\mathbf{O}_j$,
but then we would require different, somewhat stronger, assumptions regarding the vector $f(t)$.
For example, if $f(t) \neq (0, \ldots, 0)^T$ for all $t \in [a, b]$, then we can always choose diagonal matrices
$\mathbf{O}_j$ to satisfy (3.5) (see Lemma 3.5 below).

The following choices for $\mathbf{O}_j$ ensure coincidence of $\hat\theta_{MWE}$ with the three popular estimators
defined in the Introduction.

If $\mathbf{O}_j = I_m$ for all $j$, then $\hat\theta_{MWE} = \hat\theta_{OLSE}$.

If $\mathbf{O}_j = s_j I_m$ for all $j$, then $\hat\theta_{MWE} = \hat\theta_{SLSE}$.

If $W = \Sigma^{-1}$ and $\mathbf{O}_j = \omega_j e_1^T$ with $\omega_j = (X^T \Sigma^{-1})_j / f_1(t_j)$, then $\hat\theta_{MWE} = \hat\theta_{BLUE}$.

We shall call any MWE $\hat\theta_{MWE}$ optimal if it coincides with the BLUE. In view of the importance
of the last case, the corresponding result is summarized in the following lemma.

Lemma 3.3 Consider the regression model (1.1) and let $f_1(t) \neq 0$ for all $t \in [a, b]$. For a given
set of $N$ observation points $\{t_1, \ldots, t_N\}$ the MWE $\hat\theta_{MWE}$ defines a BLUE if $\mathbf{O}_j = \omega_j^* e_1^T$ with
$\omega_j^* = (X^T \Sigma^{-1})_j / f_1(t_j)$.

If the covariance kernel of the error process has triangular form (2.4) then we can derive the
explicit form for the optimal MWE. The result follows by a direct application of Lemma A.1.

Lemma 3.4 Assume that the covariance kernel $K(\cdot, \cdot)$ has the form (2.4) and that the matrix
$\Sigma = (K(t_i, t_j))_{i,j=1,\ldots,N}$ is positive definite with entries $K(t_i, t_j) = u_i v_j$ for $i \le j$, where for
$k = 1, \ldots, N$ we denote $u_k = u(t_k)$, $v_k = v(t_k)$ and $q_k = u_k / v_k$. Then we have the following
representation for the optimal vectors $\omega_j^* = (X^T \Sigma^{-1})_j / f_1(t_j) \in \mathbb{R}^m$ introduced in Lemma 3.3:

$$\omega_1^* = \frac{c}{f_1(t_1)} \big(\tilde\sigma_{1,1} f(t_1) + \tilde\sigma_{1,2} f(t_2)\big) = \frac{c\, u_2}{f_1(t_1) v_1 v_2 (q_2 - q_1)} \left(\frac{f(t_1)}{u_1} - \frac{f(t_2)}{u_2}\right), \tag{3.8}$$

$$\omega_N^* = \frac{c}{f_1(t_N)} \big(\tilde\sigma_{N,N} f(t_N) + \tilde\sigma_{N-1,N} f(t_{N-1})\big) = \frac{c}{f_1(t_N) v_N (q_N - q_{N-1})} \left(\frac{f(t_N)}{v_N} - \frac{f(t_{N-1})}{v_{N-1}}\right), \tag{3.9}$$

$$\omega_i^* = \frac{c}{f_1(t_i)} \big(\tilde\sigma_{i,i} f(t_i) + \tilde\sigma_{i-1,i} f(t_{i-1}) + \tilde\sigma_{i,i+1} f(t_{i+1})\big)
= \frac{c}{f_1(t_i) v_i} \left(\frac{(q_{i+1} - q_{i-1}) f(t_i)}{v_i (q_{i+1} - q_i)(q_i - q_{i-1})} - \frac{f(t_{i-1})}{v_{i-1}(q_i - q_{i-1})} - \frac{f(t_{i+1})}{v_{i+1}(q_{i+1} - q_i)}\right) \tag{3.10}$$

for $i = 2, \ldots, N-1$. Here in formulas (3.8), (3.9) and (3.10) $\tilde\sigma_{i,j}$ denote the elements of the
matrix $\Sigma^{-1} = (\tilde\sigma_{i,j})_{i,j=1,\ldots,N}$.


The following provides a result similar to Lemmas 3.2 and 3.3 in the case where the matrices
$\mathbf{O}_j$ are diagonal. An extension of Lemma 3.4 to matrices $\mathbf{O}_j$ of the diagonal form is
straightforward and omitted for the sake of brevity.

Lemma 3.5 Consider the regression model (1.1) and let $f_k(t) \neq 0$ for all $t \in [a, b]$ and all
$k = 1, \ldots, m$. For each $j = 1, \ldots, N$, define the diagonal matrix $\mathbf{O}_j$ by its diagonal elements

$$(\mathbf{O}_j)_{k,k} = \frac{1}{f_k(t_j)} (X^T W)_{kj}; \quad k = 1, \ldots, m,$$

where $(X^T W)_{kj}$ denotes the $(k, j)$-th element of the matrix $X^T W$. Then $\hat\theta_{MWE} = \hat\theta_{WLSE}$.
If additionally $W = \Sigma^{-1}$, so that $(\mathbf{O}_j)_{k,k} = (X^T \Sigma^{-1})_{kj} / f_k(t_j)$, then $\hat\theta_{MWE} = \hat\theta_{BLUE}$.


3.2 Weak convergence of matrix-weighted designs 

Let $Q : [a, b] \to [0, 1]$ be an increasing function on the interval $[a, b]$ with $Q(a) = 0$ and $Q(b) = 1$,
so that $Q(\cdot)$ is a c.d.f. For a fixed $N$ and $j = 1, \ldots, N$, define the points $t_{1,N}, \ldots, t_{N,N}$ by (2.9).
Suppose that with each $t \in [a, b]$ we can associate an $m \times m$ matrix $\mathbf{O}(t)$ and consider an
$N$-point matrix-weighted design $\xi_N$ of the form (3.4) with $t_j = t_{j,N}$ and $\mathbf{O}_j = \mathbf{O}(t_{j,N})$. In view
of (3.1) and (3.3) this design has the covariance matrix

$$D(\xi_N) = M^{-1}(\xi_N) B(\xi_N) \big(M^{-1}(\xi_N)\big)^T,$$

where the matrices $M(\xi_N)$ and $B(\xi_N)$ are defined by

$$M(\xi_N) = \frac{1}{N} \sum_{j=1}^{N} \mathbf{O}(t_{j,N}) f(t_{j,N}) f^T(t_{j,N}),$$

$$B(\xi_N) = \frac{1}{N^2} \sum_{i=1}^{N} \sum_{j=1}^{N} K(t_{i,N}, t_{j,N})\, \mathbf{O}(t_{i,N}) f(t_{i,N}) f^T(t_{j,N}) \mathbf{O}^T(t_{j,N}).$$

In addition to the sequence of matrix-weighted designs $\xi_N$ consider the sequence of uniform
distributions on the set $\{t_{1,N}, \ldots, t_{N,N}\}$. As $N \to \infty$, this sequence converges weakly to the
design (probability measure) $\zeta$ on the interval $[a, b]$ with distribution function $Q$. This implies

$$\lim_{N \to \infty} M(\xi_N) = M(\xi) = \int_a^b \mathbf{O}(t) f(t) f^T(t)\, \zeta(dt),$$

$$\lim_{N \to \infty} B(\xi_N) = B(\xi) = \int_a^b \int_a^b K(t, s)\, \mathbf{O}(t) f(t) f^T(s) \mathbf{O}^T(s)\, \zeta(dt) \zeta(ds),$$

and

$$\lim_{N \to \infty} D(\xi_N) = D(\xi) = M^{-1}(\xi) B(\xi) \big(M^{-1}(\xi)\big)^T \tag{3.11}$$

under the assumptions that the vector-valued function $f$, the matrix-valued function $\mathbf{O}$ and the
kernel $K$ are continuous on the interval $[a, b]$ and the generalized information matrix $M(\xi)$ is
non-singular. Moreover, the sequence of estimators (3.2) converges (almost surely as $N \to \infty$)
to

$$\hat\theta_{MWE,\infty} = M^{-1}(\xi) \int_a^b \mathbf{O}(t) f(t) y(t)\, \zeta(dt), \tag{3.12}$$

where $\{y(t) \mid t \in [a, b]\}$ is the stochastic process in the continuous time model (1.2). Bearing
these limiting expressions in mind we say that the sequence of matrix-weighted designs $\xi_N$
defined by (3.4) converges to the limiting matrix-weighted design $\xi(dt) = \mathbf{O}(t)\zeta(dt)$ as $N \to \infty$.
This relation justifies the notation $M(\xi)$, $B(\xi)$ and $D(\xi)$ of the previous paragraph.

The (optimal) limiting matrix-weighted designs which will be constructed below will have a
similar structure as the signed measures in (2.12). They will assign matrix weights $\mathbf{O}_a$ and $\mathbf{O}_b$
to the end-points of the interval $[a, b]$ and a 'matrix density' $\mathbf{O}(t)$ to the points $t \in (a, b)$; that
is, these designs will have the form

$$\xi(dt) = \mathbf{O}_a \delta_a(dt) + \mathbf{O}_b \delta_b(dt) + \mathbf{O}(t)\,dt. \tag{3.13}$$

In view of (3.12), the MWE in the continuous time model (1.2) associated with any design of
the form (3.13) can be written as

$$\hat\theta_{MWE}(\xi) = M^{-1}(\xi) \left[\mathbf{O}_a f(a) y(a) + \mathbf{O}_b f(b) y(b) + \int_a^b \mathbf{O}(t) f(t) y(t)\,dt\right], \tag{3.14}$$

where $M(\xi) = \mathbf{O}_a f(a) f^T(a) + \mathbf{O}_b f(b) f^T(b) + \int_a^b \mathbf{O}(t) f(t) f^T(t)\,dt$. In the particular case
associated with Lemma 3.4 we have the following structure of the matrices $\mathbf{O}_a$ and $\mathbf{O}_b$ and the
matrix function $\mathbf{O}(t)$ in (3.13):

$$\mathbf{O}_a = \omega_a e_1^T, \quad \mathbf{O}_b = \omega_b e_1^T, \quad \mathbf{O}(t) = \omega(t) e_1^T \ \text{ for } t \in (a, b), \tag{3.15}$$

where $\omega_a$ and $\omega_b$ are some $m$-dimensional vectors and $\omega(t) \in \mathbb{R}^m$ is some vector-valued function
defined on the interval $(a, b)$. Note that $\omega(t)$ does not have to approach $\omega_a$ and $\omega_b$ as $t \to a$
and $t \to b$, respectively.

When the sequence of matrix-weighted designs is defined by the formulas of Lemma 3.3 we can
compute the limiting matrix-weighted design. The proof follows by similar arguments as given
in the proof of Theorem 2.1 and is therefore omitted.


Corollary 3.1 Consider model (1.1), where the error process $\{\varepsilon(t) \mid t \in [a, b]\}$ has a covariance
kernel $K$ of the form (2.4). Assume that $u(\cdot)$, $v(\cdot)$, $q(\cdot)$ are strictly positive, twice continuously
differentiable functions on the interval $[a, b]$ and that the vector-valued function $f(\cdot)$ is twice
continuously differentiable with $f_1(t) \neq 0$ for all $t \in [a, b]$. Consider the matrix-weighted design
$\xi_N$ of the form (3.4), where the support points $t_j = t_{j,N}$ are generated by (2.9) and the matrix
weights $\mathbf{O}_j = \mathbf{O}_{j,N}$ are defined in Lemma 3.3. The sequence $\{\xi_N\}_{N \in \mathbb{N}}$ converges (in the sense
defined in the previous paragraph) to a matrix-weighted design $\xi$ defined by (3.13) and
(3.15) with

$$\omega_a = \frac{c}{f_1(a) v^2(a) q'(a)} \left[\frac{f(a) u'(a)}{u(a)} - f'(a)\right], \quad
\omega_b = \frac{c\, h'(b)}{f_1(b) v(b) q'(b)}, \quad
\omega(t) = -\frac{c}{f_1(t) v(t)} \left[\frac{h'(t)}{q'(t)}\right]', \tag{3.16}$$

where $h(t) = f(t)/v(t)$ and the constant $c \neq 0$ is arbitrary.

In Corollary 3.1 the one-column representation of the matrices $\mathbf{O}_j$ is used. The following
statement contains a similar result for the case where the matrices $\mathbf{O}_j$ are diagonal.


Corollary 3.2 Let the conditions of Corollary 3.1 hold and assume additionally that $f_k(t) \neq 0$
for all $t \in [a, b]$ and all $k = 1, \ldots, m$. Consider the matrix-weighted design $\xi_N$ of the form (3.4),
where the support points $t_j = t_{j,N}$ are generated by (2.9) and the matrices $\mathbf{O}_j = \mathbf{O}_{j,N}$ are defined
in Lemma 3.5 with diagonal elements given by $(\mathbf{O}_j)_{k,k} = (X^T \Sigma^{-1})_{kj} / f_k(t_j)$. Then the sequence
$\{\xi_N\}_{N \in \mathbb{N}}$ converges to the optimal matrix-weighted design $\xi^*$ of the form (3.13), where the
diagonal elements of the matrices $\mathbf{O}_a = \mathrm{diag}(\mathbf{O}_{a,11}, \ldots, \mathbf{O}_{a,mm})$, $\mathbf{O}_b = \mathrm{diag}(\mathbf{O}_{b,11}, \ldots, \mathbf{O}_{b,mm})$
and $\mathbf{O}(t) = \mathrm{diag}(\mathbf{O}_{11}(t), \ldots, \mathbf{O}_{mm}(t))$ are given by

$$\mathbf{O}_{a,jj} = \frac{c}{f_j(a) v^2(a) q'(a)} \left[\frac{f_j(a) u'(a)}{u(a)} - f_j'(a)\right], \quad
\mathbf{O}_{b,jj} = \frac{c\, h_j'(b)}{f_j(b) v(b) q'(b)}, \quad
\mathbf{O}_{jj}(t) = -\frac{c}{f_j(t) v(t)} \left[\frac{h_j'(t)}{q'(t)}\right]',$$

respectively, $h_j(t) = f_j(t)/v(t)$, $j = 1, \ldots, m$, and the constant $c \neq 0$ is arbitrary.
3.3 Optimal designs and best linear estimators 

In this section we consider again the continuous time model (1.2), where the full trajectory of the
process $\{y(t) \mid t \in [a, b]\}$ can be observed. We start by recalling some known facts concerning best
linear unbiased estimation. For details we refer the interested reader to the work of Grenander
(1950) or Section 2.2 in Näther (1985a). Any linear estimator of $\theta$ can be written in the form
of the integral

$$\hat\theta_\mu = \int_a^b y(t)\, \mu(dt), \tag{3.17}$$

where $\mu(dt) = (\mu_1(dt), \ldots, \mu_m(dt))^T$ is a vector of signed measures on the interval $[a, b]$. For given
$\mu$, the estimator $\hat\theta_\mu$ is unbiased if and only if $\int_a^b f(t) \mu^T(dt) = I_m$, where $I_m$ denotes the
$m$-dimensional identity matrix. Theorem 2.3 in Näther (1985a) states that the estimator $\hat\theta_{\mu^*}$ is
BLUE if and only if $\int_a^b f(t) \mu^{*T}(dt) = I_m$ and the identity

$$\int_a^b K(u, v)\, \mu^*(dv) = A f(u)$$

holds for all $u \in [a, b]$, where $A$ is some $m \times m$ matrix. The matrix $A$ is uniquely defined and
coincides with the matrix

$$D^* = \mathrm{Var}(\hat\theta_{\mu^*}) = \inf_{\mu \text{ a vector of signed measures}} \int_a^b \int_a^b K(u, v)\, \mu(du) \mu^T(dv). \tag{3.18}$$

The Gauss-Markov theorem further implies that $D^* \le \mathrm{Var}(\hat\theta)$, where $\hat\theta$ is any other linear
unbiased estimator of $\theta$.

Definition 3.2 A matrix-weighted design $\xi^*$ is called optimal if $D(\xi^*) = D^*$, where $D(\xi)$ is
defined in (3.11) and $D^*$ is defined in (3.18).

The designs we consider have the form (3.13) and the corresponding MWE are expressed by
(3.14). The estimator (3.14) can be expressed in the form (3.17), that is $\hat\theta_{MWE}(\xi) = \hat\theta_\mu$ with

$$\mu(dt) = M^{-1}(\xi) \big[\mathbf{O}_a f(a) \delta_a(dt) + \mathbf{O}_b f(b) \delta_b(dt) + \mathbf{O}(t) f(t)\,dt\big].$$

The estimators defined in (3.14) are always unbiased and the following result provides the
matrix-weighted optimal design and the BLUE in the continuous time model (1.2). The proof
follows by similar arguments as given in the proofs of Theorems 2.2 and 2.3 and is therefore
omitted.


Theorem 3.1 Let $K(t, s)$ be a covariance kernel of the form (2.4) and the vector-function $f(\cdot)$
be twice continuously differentiable with $f_1(t) \neq 0$ for all $t \in [a, b]$. Under the assumptions
of Corollary 3.1 the matrix-weighted design $\xi^*$ defined by the formulas (3.13) and (3.16) with
$c = 1$ is optimal in the sense of Definition 3.2. Moreover, if

$$\mu^*(dt) = M^{-1}(\xi^*) \big[\omega_a e_1^T f(a) \delta_a(dt) + \omega_b e_1^T f(b) \delta_b(dt) + \omega(t) e_1^T f(t)\,dt\big],$$

then $\hat\theta_{\mu^*}$ defines the BLUE in model (1.2). Additionally, we have

$$D(\xi^*) = D^* = \int_a^b \int_a^b K(s, t)\, \mu^*(ds) \mu^{*T}(dt) = M^{-1}(\xi^*),$$

where the matrix $M(\xi^*)$ is given by

$$M(\xi^*) = \frac{\tilde f(q(a)) \tilde f^T(q(a))}{q(a)} + \int_{q(a)}^{q(b)} \tilde f'(s) \tilde f'^T(s)\,ds$$

and $\tilde f(s) = f(q^{-1}(s)) / v(q^{-1}(s))$.

In Theorem 3.1 we have used the one-column representation for the matrices $\mathbf{O}(t)$. Similar
arguments establish the optimality of the matrix-weighted designs $\xi^*$ defined in Corollary 3.2,
where the diagonal representation for the matrices $\mathbf{O}(t)$ is used. The details are omitted for
the sake of brevity.
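
As a consistency check (our own worked example, not part of the original text), the formula for $M(\xi^*)$ can be evaluated in the one-parameter setting of the first example of Section 2.6. We assume $m = 1$, $f(t) = t^2 + 1$ on $[1, 2]$ and the Brownian motion kernel, that is $u(t) = t$, $v(t) = 1$ and $q(t) = t$. Then $\tilde f = f$ and

$$M(\xi^*) = \frac{f^2(a)}{a} + \int_a^b \big(f'(s)\big)^2\,ds = 4 + \int_1^2 4 s^2\,ds = \frac{40}{3}, \qquad D^* = M^{-1}(\xi^*) = \frac{3}{40} = 0.075,$$

which agrees to three significant digits with the value $D^* \approx 0.075004$ reported in Section 2.6.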


3.4 Examples of optimal matrix-weighted designs 

Consider the polynomial regression model with $f(t) = (1, t, t^2, \ldots, t^{m-1})^T$, $t \in [a, b]$, and the
covariance kernel of the Brownian motion $K(t, s) = \min(t, s)$. For the construction of
matrix-weighted designs we use matrices $\mathbf{O}(t)$ in the one-column and diagonal representations.

For the one-column representation we have from Corollary 3.1 and Theorem 3.1 that the optimal
matrix-weighted design has masses $\mathbf{O}_a = \omega_a e_1^T$ and $\mathbf{O}_b = \omega_b e_1^T$ at points $a$ and $b$, respectively,
and the density $\mathbf{O}(t) = \omega(t) e_1^T$. Here the vectors $\omega_a$, $\omega_b$ and $\omega(t)$ are given by

$$\omega_a = (1/a,\ 0,\ -a,\ \ldots,\ (2 - m) a^{m-2})^T,$$
$$\omega_b = (0,\ 1,\ 2b,\ 3b^2,\ \ldots,\ (m - 1) b^{m-2})^T,$$
$$\omega(t) = (0,\ 0,\ -2,\ -3 \cdot 2t,\ \ldots,\ -(m - 1)(m - 2) t^{m-3})^T, \quad t \in (a, b),$$

respectively. For the diagonal representation we have from Corollary 3.2 (and an analogue of
Theorem 3.1) that the optimal matrix-weighted design has masses $\mathbf{O}_a$ and $\mathbf{O}_b$ at points $a$ and
$b$, respectively, and the density $\mathbf{O}(t)$, where

$$\mathbf{O}_a = \mathrm{diag}(1/a,\ 0,\ -1/a,\ \ldots,\ (2 - m)/a),$$
$$\mathbf{O}_b = \mathrm{diag}(0,\ 1/b,\ 2/b,\ \ldots,\ (m - 1)/b),$$
$$\mathbf{O}(t) = \mathrm{diag}(0,\ 0,\ -2/t^2,\ \ldots,\ -(m - 1)(m - 2)/t^2), \quad t \in (a, b).$$
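
These vectors and matrices follow from (3.16) and Corollary 3.2 by direct substitution. As a sketch of the computation (our own working, with $c = 1$): for the Brownian motion kernel we have $u(t) = t$, $v(t) = 1$ and $q(t) = t$, so that $q' \equiv 1$, $u'/u = 1/t$ and $h = f$. With $f_k(t) = t^{k-1}$, and in particular $f_1 \equiv 1$, this gives for $k = 1, \ldots, m$

$$
\begin{aligned}
(\omega_a)_k &= \frac{f_k(a)}{a} - f_k'(a) = (2 - k)\,a^{k-2}, &
(\omega_b)_k &= f_k'(b) = (k - 1)\,b^{k-2}, &
(\omega(t))_k &= -f_k''(t) = -(k - 1)(k - 2)\,t^{k-3},\\
\mathbf{O}_{a,kk} &= \frac{1}{a} - \frac{f_k'(a)}{f_k(a)} = \frac{2 - k}{a}, &
\mathbf{O}_{b,kk} &= \frac{f_k'(b)}{f_k(b)} = \frac{k - 1}{b}, &
\mathbf{O}_{kk}(t) &= -\frac{f_k''(t)}{f_k(t)} = -\frac{(k - 1)(k - 2)}{t^2}.
\end{aligned}
$$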

Note that in this case all non-vanishing diagonal elements of the matrix $\mathbf{O}(t)$ are proportional
to the function $1/t^2$. According to Lemma 3.1 we can use $A\mathbf{O}(t)$ instead of $\mathbf{O}(t)$, for any
non-singular $m \times m$ matrix $A$. By taking the matrix

$$A = \mathrm{diag}(1,\ 1,\ -1/2,\ \ldots,\ -1/[(m - 1)(m - 2)]),$$

we obtain

$$A\mathbf{O}_a = \mathrm{diag}(1/a,\ 0,\ 1/(2a),\ \ldots,\ 1/[(m - 1)a]),$$
$$A\mathbf{O}_b = \mathrm{diag}(0,\ 1/b,\ -1/b,\ \ldots,\ -1/[(m - 2)b]),$$
$$A\mathbf{O}(t) = \mathrm{diag}(0,\ 0,\ 1/t^2,\ \ldots,\ 1/t^2), \quad t \in (a, b).$$

As another example, consider the polynomial regression model with $f(t) = (1, t, t^2, \ldots, t^{m-1})^T$,
$t \in [a, b]$, and the triangular covariance kernel of the form (2.4) with $u(t) = t^{\gamma}$ and $v(t) = t^{\omega}$.
For the diagonal representation we have from Corollary 3.2 that the optimal matrix-weighted
design has masses $\mathbf{O}_a$ and $\mathbf{O}_b$ at points $a$ and $b$, respectively, and the density $\mathbf{O}(t)$, where

$$\mathbf{O}_a = a^{-\gamma-\omega}\, \mathrm{diag}(-\gamma,\ 1 - \gamma,\ 2 - \gamma,\ \ldots,\ m - 1 - \gamma),$$
$$\mathbf{O}_b = b^{-\gamma-\omega}\, \mathrm{diag}(\omega,\ \omega - 1,\ \omega - 2,\ \ldots,\ \omega + 1 - m),$$
$$\mathbf{O}(t) = t^{-1-\gamma-\omega}\, \mathrm{diag}(r_1, r_2, \ldots, r_m), \quad t \in (a, b),$$

with $r_i = (i - 1 - \gamma)(i - 1 - \omega)$, $i = 1, \ldots, m$. If we further use $A = \mathrm{diag}(1/r_1, 1/r_2, \ldots, 1/r_m)$
then we obtain $A\mathbf{O}(t) = t^{-1-\gamma-\omega}\, \mathrm{diag}(1, 1, \ldots, 1)$, $t \in (a, b)$; that is, all components of the
matrix $A\mathbf{O}(t)$ have exactly the same density.

3.5 Practical implementation

Here we only consider the diagonal representation of the matrices $\mathbf{O}_a$, $\mathbf{O}_b$ and $\mathbf{O}(t)$; the case
of the one-column representation of the matrices $\mathbf{O}$ can be treated similarly. We assign matrix
weights $\mathbf{O}_a$ and $\mathbf{O}_b$ to the boundary points $a$ and $b$ and use an $N$-point approximation to an
absolutely continuous probability measure on $(a, b)$ with some density $\varphi(t)$. The density $\varphi(t)$ is
defined to be either the uniform density on $(a, b)$ (if nonzero elements of different components
of $\mathbf{O}(t)$ are not proportional to each other) or $\varphi(t) = c\,|\mathbf{O}_{l,l}(t)|$ for some $l \in \{1, \ldots, m\}$ (if
nonzero elements of different components of $\mathbf{O}(t)$ are proportional to each other), where $c$
is the normalization constant and $l$ is such that the density $\varphi(t)$ is not identically zero on
the interval $(a, b)$. Denote by $F(t) = \int_a^t \varphi(s)\,ds$ the corresponding c.d.f. For given $N$, we
calculate an $N$-point approximation $\{t_{1,N}, \ldots, t_{N,N};\ 1/N, \ldots, 1/N\}$, where $t_{i,N} = F^{-1}(z_{i,N})$
with $z_{i,N} = i/(N + 1)$, $i = 1, 2, \ldots, N$, to the probability measure with density $\varphi(t)$.

To each point $t_{j,N}$ we assign a vector of weights $s_j = (s_{j,N,1}, \ldots, s_{j,N,m})^T$ such that $s_{j,N,k} \in
\{-1, 0, 1\}$ ($k = 1, \ldots, m$). The values $s_{j,N,k} = \mathrm{sign}(\mathbf{O}_{k,k}(t_{j,N})) = \pm 1$ correspond to the sign of
the point $t_{j,N}$ in the estimation of $\theta_k$ exactly as in the procedure for one-parameter models
described in Section 2.5. Some of the values $s_{j,N,k}$ could be 0. If $s_{j,N,k} = 0$ for some $k$ then
the point $t_{j,N}$ is not used for the estimation of $\theta_k$. By assigning zero weight to a point $t_{j,N}$
in the $k$-th estimation direction, we perform a thinning of the sample of points $t_{1,N}, \ldots, t_{N,N}$
in the $k$-th direction and thus achieve a required density in each estimation direction. This is
a deterministic version of the well-known 'rejection method' widely used to generate samples
from various probability distributions. If the nonzero components of the matrix weight $\mathbf{O}(t)$
are proportional to each other then for these components $s_{j,N,k} = 1$ for all $j$ and $N$.

The resulting estimator $\hat\theta$ has the form (3.2), where

$$C = \big(N \mathbf{O}_a f(a),\ S_1 P f(t_1),\ \ldots,\ S_N P f(t_N),\ N \mathbf{O}_b f(b)\big),$$
$$S_j = \mathrm{diag}(s_{j,N,1}, \ldots, s_{j,N,m}) \in \mathbb{R}^{m \times m}$$

and $P$ is the diagonal $m \times m$ matrix whose diagonal elements are given by

$$P_{k,k} = \frac{N}{\sum_{j=1}^{N} |s_{j,N,k}|} \int_a^b |\mathbf{O}_{k,k}(t)|\,dt.$$

If nonzero elements of different components of the matrix weight $\mathbf{O}(t)$ are proportional to each
other (as was the case in the examples of Section 3.4) then the $(N+2)$-point approximations to
the limiting design are very similar to the approximations in the one-parameter case considered
in Section 2.5; their accuracy is also very high. Otherwise, when the diagonal elements of $\mathbf{O}(t)$
are possibly non-proportional, the accuracy of approximations will depend on the degree of
non-homogeneity of components of the matrix weight $\mathbf{O}(t)$.

3.6 Some numerical results


For comparison of competing matrix-weighted designs for multiparameter models it is
convenient to consider a functional of the covariance matrix. As an example, we investigate in this
section the classical D-optimality criterion defined as $T(D(\xi)) = (\det D(\xi))^{1/m}$, which has to
be minimized.
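
A minimal sketch of how the criterion $(\det D)^{1/m}$ could be evaluated for a given covariance matrix is shown below. It is our illustration, not the code used for the figures; the function name `d_criterion` is hypothetical and the input is assumed to be a symmetric positive definite matrix given as a list of rows.

```python
def d_criterion(D):
    """Return (det D)^(1/m) for a symmetric positive definite m x m matrix D."""
    m = len(D)
    A = [list(row) for row in D]        # work on a copy
    det = 1.0
    for k in range(m):
        pivot = A[k][k]                  # no pivoting: adequate for positive definite input
        if pivot <= 0.0:
            raise ValueError("matrix is not positive definite")
        det *= pivot
        for i in range(k + 1, m):
            factor = A[i][k] / pivot
            for j in range(k, m):
                A[i][j] -= factor * A[k][j]
    return det ** (1.0 / m)

# det [[2, 1], [1, 2]] = 3, so the criterion equals sqrt(3).
print(d_criterion([[2.0, 1.0], [1.0, 2.0]]))
```

Taking the $m$-th root makes the criterion scale linearly with the matrix, so values for models with different numbers of parameters are on a comparable scale.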

As an example where all nonzero elements of the matrix $\mathbf{O}(t)$ are proportional to each other,
let us consider the cubic regression model with $f(t) = (1, t, t^2, t^3)^T$ and the Brownian motion
error process. The optimal value in the continuous time model (1.2) is $T(D^*) \approx 2.7927$. In
Figure 4 we display the D-criterion of the covariance matrices of the MWE and the BLUE for
the proposed $(N+2)$-point designs and the covariance matrix of the BLUE with corresponding
optimal $(N+2)$-point designs. We can see that the D-efficiency of the proposed matrix-weighted
design is very high, even for small $N$.



Figure 4: The D-optimality criterion of the covariance matrix of the MWE for the proposed
$(N+2)$-point designs (crosses), of the covariance matrix of the BLUE for the proposed
$(N+2)$-point designs (line) and of the covariance matrix of the BLUE with corresponding D-optimal
$(N+2)$-point designs (grey circles). The error process in model (1.1) is the Brownian motion
and the vector of regression functions is given by $f(t) = (1, t, t^2, t^3)$, $t \in [1, 2]$.

The second example of this section considers a situation where nonzero elements of the matrix
$\mathbf{O}(t)$ are not proportional to each other. For this purpose we consider the model (1.1) with
$f(t) = (1, t, t^2)^T$, $t \in [1, 2]$, and covariance kernel $K(t, t') = e^{-|t - t'|}$, with $u(t) = e^{t}$ and $v(t) = e^{-t}$.
Using the diagonal representation, we obtain for the optimal matrix-weighted designs

$$\mathbf{O}_a = \mathrm{diag}(1, 0, -1), \quad \mathbf{O}_b = \mathrm{diag}(1, 1.5, 2), \quad \mathbf{O}(t) = \mathrm{diag}(1, 1, 1 - 2/t^2), \quad t \in (1, 2).$$

The optimal value in the continuous time model (1.2) is given by $T(D^*) \approx 1.6779$. Since some
diagonal elements of $\mathbf{O}(t)$ are constant functions, we take the support points of the design $\xi_{N+2}$
to be equidistant in $(1, 2)$: $t_{i,N} = 1 + i/(N + 1)$ for $i = 1, \ldots, N$. Then we have $s_{j,N,k} = 1$ for all
$j = 1, \ldots, N$ and $k = 1, 2$. However, some elements of $(s_{1,N,3}, \ldots, s_{N,N,3})$ should be zero because
$\mathbf{O}_{3,3}(t)$ is not proportional to $\mathbf{O}_{1,1}(t)$. For example, for $N = 10$ the vector of signs
$(s_{1,N,3}, \ldots, s_{N,N,3})$ is $(-1, -1, 0, 0, 0, 1, 0, 0, 1, 0)$ and for $N = 30$ it is

$$(-1, 0, -1, -1, 0, -1, 0, 0, -1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 1, 0, 0, 1, 0, 1).$$

In Figure 5 we depict the D-optimality criterion of the covariance matrices for various
estimators. We observe that in this example for all $N$ the D-optimality criterion of the covariance
matrices of the MWE is slightly larger than the D-optimality criterion of the covariance
matrices of the BLUE. However, we can also see that the proposed $(N+2)$-point designs are very
efficient compared to the BLUE with corresponding D-optimal $(N+2)$-point designs even for
small $N$.



Figure 5: The D-optimality criterion of the covariance matrix of the MWE for the proposed
$(N+2)$-point designs (crosses), of the covariance matrix of the BLUE for the proposed
$(N+2)$-point designs (line) and of the BLUE with corresponding D-optimal $(N+2)$-point designs (grey
circles). The covariance kernel in model (1.1) is $K(t, t') = e^{-|t - t'|}$ and the vector of regression
functions is $f(t) = (1, t, t^2)$, $t \in [1, 2]$.

A Proof of main results

A.1 Explicit form of the inverse of the covariance matrix of errors

Here we state an auxiliary result, which gives an explicit form for the inverse of the matrix
$\Sigma = (K(t_i, t_j))_{i,j=1,\ldots,N}$ with a triangular covariance kernel $K$. We did not find this result (as
formulated below) in the literature. Versions of Lemma A.1, however, have been derived
independently by different authors; see, for example, Lemma 7.3.2 in Zhigljavsky (1991) and
formula (8) in Harman and Stulajter (2011). The proof follows from a straightforward check of
the condition $\Sigma^{-1} \Sigma = I$.

Lemma A.1 Consider a symmetric $N \times N$ matrix $\Sigma = (\sigma_{i,j})_{i,j=1,\ldots,N}$ whose elements are
defined by the formula $\sigma_{i,j} = u_i v_j$ for $1 \le i \le j \le N$. Assume that $q_1 < q_2 < \cdots < q_N$,
where $q_i = u_i / v_i$. Then the inverse matrix $\tilde\Sigma = \Sigma^{-1}$ is a symmetric tri-diagonal matrix and its
elements $\tilde\sigma_{i,j}$ with $i \le j$ can be computed as follows:

$$\tilde\sigma_{1,1} = \frac{u_2}{u_1 v_1 v_2 (q_2 - q_1)}, \qquad \tilde\sigma_{N,N} = \frac{1}{v_N^2 (q_N - q_{N-1})},$$

$$\tilde\sigma_{i,i} = \frac{q_{i+1} - q_{i-1}}{v_i^2 (q_i - q_{i-1})(q_{i+1} - q_i)} \quad (i = 2, \ldots, N - 1),$$

$$\tilde\sigma_{i,i+1} = -\frac{1}{v_i v_{i+1} (q_{i+1} - q_i)} \quad (i = 1, \ldots, N - 1),$$

$$\tilde\sigma_{i,i+k} = 0 \quad (i = 1, \ldots, N - 2,\ k \ge 2).$$

In our applications of Lemma A.1 we assume that $\sigma_{i,j} = K(t_i, t_j)$ with the covariance kernel $K$
having the form (2.4).
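
The "straightforward check" can also be carried out numerically. The sketch below is our illustration: it builds $\Sigma$ for the hypothetical choice $u(t) = t^2$, $v(t) = t$ on an arbitrary grid, fills in the tri-diagonal matrix from the formulas of Lemma A.1 and measures how far the product is from the identity.

```python
def tridiagonal_inverse(u, v):
    """Candidate inverse of Sigma = (u_i v_j)_{i<=j}, filled in from Lemma A.1."""
    n = len(u)
    q = [ui / vi for ui, vi in zip(u, v)]
    S = [[0.0] * n for _ in range(n)]
    S[0][0] = u[1] / (u[0] * v[0] * v[1] * (q[1] - q[0]))
    S[n - 1][n - 1] = 1.0 / (v[n - 1] ** 2 * (q[n - 1] - q[n - 2]))
    for i in range(1, n - 1):
        S[i][i] = (q[i + 1] - q[i - 1]) / (v[i] ** 2 * (q[i] - q[i - 1]) * (q[i + 1] - q[i]))
    for i in range(n - 1):
        S[i][i + 1] = S[i + 1][i] = -1.0 / (v[i] * v[i + 1] * (q[i + 1] - q[i]))
    return S

def covariance(u, v):
    """Sigma with entries u_min(i,j) * v_max(i,j), i.e. a triangular kernel."""
    n = len(u)
    return [[u[min(i, j)] * v[max(i, j)] for j in range(n)] for i in range(n)]

t = [1.0, 1.2, 1.5, 1.7, 2.0]
u = [ti ** 2 for ti in t]   # u(t) = t^2
v = list(t)                 # v(t) = t, so q(t) = t is increasing
Sigma, Sinv = covariance(u, v), tridiagonal_inverse(u, v)
n = len(t)
product = [[sum(Sinv[i][k] * Sigma[k][j] for k in range(n)) for j in range(n)]
           for i in range(n)]
err = max(abs(product[i][j] - (1.0 if i == j else 0.0))
          for i in range(n) for j in range(n))
print(err)  # zero up to floating-point rounding
```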

A.2 Proof of Lemma 2.1 

Denote $K_{ij} = K(t_i, t_j)$, $f_i = f(t_i)$, $a_i = f_i w_i$, $i, j = 1, \ldots, N$, and $a = (a_1, \ldots, a_N)^T$, 
$f = (f_1, \ldots, f_N)^T$. Then for any signed measure $\xi = \{t_1, \ldots, t_N; w_1, \ldots, w_N\}$ we have 

\[
D(\xi) = \frac{\sum_i \sum_j K_{ij} f_i f_j w_i w_j}{\big(\sum_i f_i^2 w_i\big)^2}
= \frac{\sum_i \sum_j K_{ij} a_i a_j}{\big(\sum_i f_i a_i\big)^2}
= \frac{a^T \Sigma a}{(a^T f)^2}.
\]

Since $\Sigma$ is symmetric and $\Sigma > 0$, the inverse $\Sigma^{-1}$ exists and there is a symmetric matrix 
$\Sigma^{1/2} > 0$ such that $\Sigma = \Sigma^{1/2}\Sigma^{1/2}$. Denote $b = \Sigma^{1/2} a$ and $d = \Sigma^{-1/2} f$. Then we can write 
the design optimality criterion $D(\xi)$ as $D(\xi) = b^T b/(b^T d)^2$. The Cauchy-Schwarz inequality 
gives for any two vectors $b$ and $d$ the inequality $(b^T d)^2 \le (b^T b)(d^T d)$, that is, 
$b^T b/(b^T d)^2 \ge 1/(d^T d)$. This inequality with $b$ and $d$ as above is equivalent to 
$D(\xi) \ge 1/(f^T \Sigma^{-1} f)$ for all $\xi$. Equality is attained if the vector $b$ is proportional to the 
vector $d$; that is, if $b_i = c\,d_i$ for all $i$ and some $c \ne 0$. Finally, the equality $b_i = c\,d_i$ can be 
rewritten in the form $w_i = c\,(\Sigma^{-1} f)_i / f(t_i)$. 
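The bound and the optimal weights derived above can be illustrated numerically. This is a sketch under assumptions that are not in the original text: the Brownian-motion kernel $K(t,t') = \min(t,t')$, the regression function $f(t) = t^2$, five equally spaced points on $[1,2]$ and the helper names are all chosen for illustration, and the constant $c$ is set to 1 because the criterion does not change when all weights are rescaled.

```python
def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    m = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= factor * m[col][k]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def criterion(sigma, f_vals, weights):
    """D(xi) = a^T Sigma a / (a^T f)^2 with a_i = f_i * w_i."""
    n = len(f_vals)
    a = [fi * wi for fi, wi in zip(f_vals, weights)]
    numerator = sum(sigma[i][j] * a[i] * a[j] for i in range(n) for j in range(n))
    denominator = sum(fi * ai for fi, ai in zip(f_vals, a)) ** 2
    return numerator / denominator


# Hypothetical setting: Brownian motion kernel and f(t) = t^2 on five points in [1, 2].
points = [1.0, 1.25, 1.5, 1.75, 2.0]
sigma = [[min(s, t) for t in points] for s in points]
f_vals = [t * t for t in points]

sigma_inv_f = solve(sigma, f_vals)                         # Sigma^{-1} f
lower_bound = 1.0 / sum(fi * si for fi, si in zip(f_vals, sigma_inv_f))
optimal_weights = [si / fi for si, fi in zip(sigma_inv_f, f_vals)]  # c = 1
uniform_weights = [0.2] * len(points)

d_optimal = criterion(sigma, f_vals, optimal_weights)
d_uniform = criterion(sigma, f_vals, uniform_weights)
```

In this example `d_optimal` coincides with `lower_bound` up to rounding, while `d_uniform` is not smaller. Some of the optimal weights are negative, which is why the lemma is formulated for signed measures.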
A.3 Proof of Theorem 2.1 

Before starting the main proof we recall the definition of the design points (2.9) and prove the 
following auxiliary result. 

Lemma A.2 Assume that $q(\cdot) = u(\cdot)/v(\cdot)$ is a twice continuously differentiable function on 
the interval $[a, b]$. Then for all $i = 1, \ldots, N-1$, we have 

\[
t_{i+1,N} - t_{i,N} = \frac{1}{N Q'(t_{i,N})} + O\Big(\frac{1}{N^2}\Big) \quad \text{as } N \to \infty, \tag{A.1}
\]
\[
\Delta_N = (t_{i+1,N} - t_{i-1,N})/2 = \frac{1}{N Q'(t_{i,N})}\Big(1 + O\Big(\frac{1}{N}\Big)\Big) \quad \text{as } N \to \infty. \tag{A.2}
\]

Proof of Lemma A.2. Recall the definition $z_{i,N} = (i - 1/2)/N$ ($i = 1, \ldots, N$) and set 

\[
m = q(a) = \min_{t \in [a,b]} q(t), \qquad M = q(b) = \max_{t \in [a,b]} q(t).
\]

From the definition of the function $Q$ in (2.8) we have 

\[
q(t_{i+1,N}) - q(t_{i,N}) = (M - m)(z_{i+1,N} - z_{i,N}) = \frac{M - m}{N} \tag{A.3}
\]

for all $i = 1, \ldots, N-1$. Taylor's formula yields for any $z$ 

\[
Q^{-1}(z + \delta) = Q^{-1}(z) + \delta \cdot (Q^{-1})'(z) + O(\delta^2) \quad \text{as } \delta \to 0.
\]

In this formula, set $z = z_{i,N}$ and $\delta_N = 1/N$ so that $z + \delta_N = z_{i+1,N}$. We thus obtain 

\[
t_{i+1,N} = Q^{-1}(z_{i+1,N}) = Q^{-1}(z_{i,N}) + \frac{1}{N}\,(Q^{-1})'(z_{i,N}) + O\Big(\frac{1}{N^2}\Big) \quad \text{as } N \to \infty.
\]

By using (2.9) and the relation $(Q^{-1})'(z) = 1/Q'(Q^{-1}(z))$ we can rewrite this in the form (A.1). 
The second statement obviously follows from (A.1). 
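The spacing formula (A.1) can be illustrated with a small numerical experiment. The sketch below rests on assumptions that are not stated in this appendix: $Q$ is taken to be $Q(t) = (q(t) - q(a))/(q(b) - q(a))$, which is the form suggested by (A.3), and $q(t) = t^2$ on $[a,b] = [1,2]$ is a hypothetical choice for which $Q^{-1}$ is available in closed form.

```python
import math

A, B = 1.0, 2.0  # hypothetical interval [a, b]


def q(t):
    return t * t  # hypothetical strictly increasing q


def q_prime(t):
    return 2.0 * t


def big_q_prime(t):
    """Q'(t) under the assumption Q(t) = (q(t) - q(a)) / (q(b) - q(a))."""
    return q_prime(t) / (q(B) - q(A))


def big_q_inverse(z):
    """Closed-form inverse of Q for q(t) = t^2."""
    return math.sqrt(q(A) + z * (q(B) - q(A)))


def design_points(n):
    """Points t_{i,N} = Q^{-1}(z_{i,N}) with z_{i,N} = (i - 1/2) / N."""
    return [big_q_inverse((i - 0.5) / n) for i in range(1, n + 1)]


def max_relative_spacing_error(n):
    """Largest relative gap between t_{i+1,N} - t_{i,N} and 1 / (N Q'(t_{i,N}))."""
    t = design_points(n)
    worst = 0.0
    for i in range(n - 1):
        predicted = 1.0 / (n * big_q_prime(t[i]))
        actual = t[i + 1] - t[i]
        worst = max(worst, abs(actual / predicted - 1.0))
    return worst


error_100 = max_relative_spacing_error(100)
error_1000 = max_relative_spacing_error(1000)
```

Increasing $N$ by a factor of ten reduces the relative error by roughly the same factor in this example, in line with the $O(1/N)$ relative accuracy implied by (A.1).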

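Proof of Theorem 2.1. In view of Lemma 2.1 and (2.5) - (2.7) we have 

\[
w_{1,N} = \frac{c_N u_2}{f_1 v_1 v_2}\Big(\frac{f_1}{u_1} - \frac{f_2}{u_2}\Big), \qquad
w_{N,N} = \frac{c_N}{f_N v_N}\Big(\frac{f_N}{v_N} - \frac{f_{N-1}}{v_{N-1}}\Big),
\]
\[
w_{i,N} = \frac{c_N}{f_i v_i}\Big(\frac{2 f_i}{v_i} - \frac{f_{i-1}}{v_{i-1}} - \frac{f_{i+1}}{v_{i+1}}\Big) \qquad (i = 2, \ldots, N-1),
\]

where we have used the relations (A.3). Here $c_N$ is the normalization constant providing 
$\sum_{i=1}^{N} |w_{i,N}| = 1$ and we use the notation $u_i = u(t_{i,N})$, $v_i = v(t_{i,N})$ and $f_i = f(t_{i,N})$. 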
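Consider first $w_{1,N}$. Denote $g(t) = f(t)/u(t)$, then 

\[
w_{1,N}/c_N = \frac{u(t_{2,N})}{f(t_{1,N})\,v(t_{1,N})\,v(t_{2,N})}\,\big(g(t_{1,N}) - g(t_{2,N})\big). \tag{A.4}
\]

Both $t_{1,N}$ and $t_{2,N}$ are within $O(1/N)$ of $a$, which gives 

\[
\frac{u(t_{2,N})}{f(t_{1,N})\,v(t_{1,N})\,v(t_{2,N})} = \frac{u(a)}{f(a)\,v^2(a)}\Big(1 + O\Big(\frac{1}{N}\Big)\Big), \tag{A.5}
\]
\[
g(t_{1,N}) - g(a) = g'(a)(t_{1,N} - a) + O\big((t_{1,N} - a)^2\big) = g'(a)\cdot\frac{1}{2N Q'(a)} + O\Big(\frac{1}{N^2}\Big)
\]

as $N \to \infty$. Similarly 

\[
g(t_{2,N}) - g(a) = g'(a)\cdot\frac{3}{2N Q'(a)} + O\Big(\frac{1}{N^2}\Big),
\]

yielding 

\[
g(t_{1,N}) - g(t_{2,N}) = -\frac{g'(a)}{N Q'(a)} + O\Big(\frac{1}{N^2}\Big). \tag{A.6}
\]

Combining (A.4), (A.5) and (A.6) we obtain 

\[
\frac{w_{1,N}}{c_N} = -\frac{1}{N}\cdot\frac{u(a)\,g'(a)}{f(a)\,v^2(a)\,Q'(a)}\Big(1 + O\Big(\frac{1}{N}\Big)\Big)
= \frac{1}{N\,v^2(a)\,Q'(a)}\Big[\frac{u'(a)}{u(a)} - \frac{f'(a)}{f(a)}\Big]\Big(1 + O\Big(\frac{1}{N}\Big)\Big) \tag{A.7}
\]

as $N \to \infty$. Similarly to (A.7) we get the asymptotic expression for $w_{N,N}$: 

\[
\frac{w_{N,N}}{c_N} = \frac{1}{N}\cdot\frac{h'(b)}{f(b)\,v(b)\,Q'(b)}\Big(1 + O\Big(\frac{1}{N}\Big)\Big) \tag{A.8}
\]

as $N \to \infty$, where $h(t) = f(t)/v(t)$. Consider now the weights 

\[
w_{i,N} = c_N\,\frac{2h(t_{i,N}) - h(t_{i-1,N}) - h(t_{i+1,N})}{f(t_{i,N})\,v(t_{i,N})} \qquad (i = 2, \ldots, N-1). \tag{A.9}
\]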


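Assume $N \to \infty$ and $i = i(N)$ is such that $i(N)/N = z + O(1/N)$ as $N \to \infty$ for some 
$z \in (0,1)$, and set $t = Q^{-1}(z)$. 

We are going to prove that 

\[
\frac{w_{i,N}}{c_N} = \frac{2h(t_{i,N}) - h(t_{i-1,N}) - h(t_{i+1,N})}{f(t_{i,N})\,v(t_{i,N})} \tag{A.10}
\]
\[
= \frac{1}{N^2 (Q'(t))^2 f(t) v(t)}\Big[\frac{h'(t)Q''(t)}{Q'(t)} - h''(t)\Big]\Big(1 + O\Big(\frac{1}{N}\Big)\Big)
= -\frac{1}{N^2 Q'(t) f(t) v(t)}\Big[\frac{h'(t)}{Q'(t)}\Big]'\Big(1 + O\Big(\frac{1}{N}\Big)\Big). \tag{A.11}
\]

First, in view of (2.9) we have $t_{i,N} = t + O(1/N)$ and hence 

\[
f(t_{i,N})\,v(t_{i,N}) = f(t)\,v(t)\Big(1 + O\Big(\frac{1}{N}\Big)\Big) \quad \text{as } N \to \infty.
\]

Consider the numerator in (A.10) and rewrite it as follows: 

\[
2h(t_{i,N}) - h(t_{i-1,N}) - h(t_{i+1,N}) = \big[2h(\tilde t_{i,N}) - h(t_{i-1,N}) - h(t_{i+1,N})\big] + 2\big[h(t_{i,N}) - h(\tilde t_{i,N})\big],
\]

where $\tilde t_{i,N} = (t_{i-1,N} + t_{i+1,N})/2$. We obviously have $t_{i+1,N} = \tilde t_{i,N} + \Delta_N$ and 
$t_{i-1,N} = \tilde t_{i,N} - \Delta_N$, where $\Delta_N = (t_{i+1,N} - t_{i-1,N})/2$ is defined in (A.2). This yields 

\[
\frac{2h(\tilde t_{i,N}) - h(t_{i-1,N}) - h(t_{i+1,N})}{\Delta_N^2} = -h''(t)\Big(1 + O\Big(\frac{1}{N}\Big)\Big). \tag{A.12}
\]

Next we consider 

\[
\frac{h(t_{i,N}) - h(\tilde t_{i,N})}{\Delta_N^2} = \frac{h(t_{i,N}) - h(\tilde t_{i,N})}{t_{i,N} - \tilde t_{i,N}} \cdot \frac{t_{i,N} - \tilde t_{i,N}}{\Delta_N^2}.
\]

For the first factor we have 

\[
\frac{h(t_{i,N}) - h(\tilde t_{i,N})}{t_{i,N} - \tilde t_{i,N}} = h'(t)\Big(1 + O\Big(\frac{1}{N}\Big)\Big),
\]

while the second factor gives 

\[
\frac{t_{i,N} - \tilde t_{i,N}}{\Delta_N^2} = 2\,\frac{2t_{i,N} - t_{i+1,N} - t_{i-1,N}}{(t_{i+1,N} - t_{i-1,N})^2}
= 2\,\frac{\big[2Q^{-1}(z_{i,N}) - Q^{-1}(z_{i+1,N}) - Q^{-1}(z_{i-1,N})\big]/(1/N^2)}{\big(\big[Q^{-1}(z_{i+1,N}) - Q^{-1}(z_{i-1,N})\big]/(1/N)\big)^2}
\]
\[
= -\frac{2\,(Q^{-1})''(z)}{\big(2\,(Q^{-1})'(z)\big)^2}\Big(1 + O\Big(\frac{1}{N}\Big)\Big)
= \frac{Q''(t)}{2Q'(t)}\Big(1 + O\Big(\frac{1}{N}\Big)\Big),
\]

where we have used the relations $(Q^{-1})'(z) = 1/Q'(t)$ and $(Q^{-1})''(z) = -Q''(t)/(Q'(t))^3$ with 
$t = Q^{-1}(z)$ in the last equation. This gives, as $N \to \infty$, 

\[
2\,\frac{h(t_{i,N}) - h(\tilde t_{i,N})}{\Delta_N^2}
= 2\,\frac{h(t_{i,N}) - h(\tilde t_{i,N})}{t_{i,N} - \tilde t_{i,N}} \cdot \frac{t_{i,N} - \tilde t_{i,N}}{\Delta_N^2}
= \frac{h'(t)Q''(t)}{Q'(t)}\Big(1 + O\Big(\frac{1}{N}\Big)\Big). \tag{A.13}
\]

Combining the expressions (A.2), (A.10), (A.12) and (A.13) yields the asymptotic expression 
(A.11) for $w_{i,N}/c_N$. 

By noting that $c_N = NC\,(1 + O(1/N))$ as $N \to \infty$ and that the asymptotic density of the points 
$t_{i,N}$ ($i = 1, \ldots, N$) is $Q'(t)$ on the interval $[a, b]$, we deduce the statement of the theorem as 
a consequence of the asymptotic formulas (A.7), (A.8) and (A.11) for $w_{1,N}/c_N$, $w_{N,N}/c_N$ and 
$w_{i,N}/c_N$, respectively. 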
A.4 Proof of Theorem 2.2 

By Theorem 3.3 in Dette et al. (2013) a design $\xi$ minimizes the functional (2.15) if the identity 

\[
\int_a^b \min(s,t)\,f(t)\,d\xi(t) = \lambda f(s) \tag{A.14}
\]

holds $\xi$-a.e., where $\lambda$ is some constant. We consider the design $\xi = \xi^*$ defined by (2.12) 
and (2.13) and verify for this design the condition (A.14). To do this we calculate by partial 
integration 

\[
\frac{1}{c}\int_a^b \min(s,t)\,f(t)\,p(t)\,dt
= \int_a^s t\,(-f''(t))\,dt + s\int_s^b (-f''(t))\,dt
= (a f'(a) - f(a)) - s f'(b) + f(s).
\]

Observing the definition of the masses in (2.13), the identity (A.14) follows with $\lambda = c$. This 
proves the first part of Theorem 2.2. 

For a proof of the second statement consider a linear unbiased estimator $\hat\theta$ in model (2.1) 
based on the full trajectory and defined by the signed measure $\mu^*(dt) = f(t)\xi^*(dt)$, where $\xi^*$ 
is the design in (2.12), (2.13) with a constant $c$ chosen such that $\hat\theta$ is unbiased, that is 

\[
c = \Big[\frac{f^2(a)}{a} + \int_a^b (f'(t))^2\,dt\Big]^{-1}.
\]

Standard arguments of optimal design theory show that $\mu^*$ minimizes $\Phi$ (that is, $\hat\theta$ is BLUE 
in model (2.1) where the full trajectory can be observed) if and only if the inequality 

\[
\Phi(\mu^*, \nu) = \int_a^b\!\!\int_a^b K(x,y)\,\mu^*(dx)\,\nu(dy) \ge \Phi(\mu^*) \tag{A.15}
\]

holds for all signed measures $\nu$ satisfying $\int_a^b f(t)\,\nu(dt) = 1$. Observing this condition and the 
identity (A.14) we obtain 

\[
\Phi(\mu^*, \nu) = c^* \int_a^b f(s)\,\nu(ds) = \Big[\frac{f^2(a)}{a} + \int_a^b (f'(t))^2\,dt\Big]^{-1} = \Phi(\mu^*)
\]

for all signed measures $\nu$ on $[a,b]$ with $\int_a^b f(t)\,\nu(dt) = 1$. By (A.15) $\mu^*$ minimizes $\Phi$. 
Consequently, the corresponding estimator $\hat\theta$ is BLUE with minimal variance 

\[
D^* = c^* = \Big[\frac{f^2(a)}{a} + \int_a^b (f'(t))^2\,dt\Big]^{-1}.
\]
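The variance formula for the continuous-time BLUE can be compared with the variance of the BLUE based on finitely many observations. The sketch below is illustrative only: it assumes the one-parameter model with Brownian-motion errors, the hypothetical choice $f(t) = t^2$ on $[1,2]$ and an equally spaced grid containing both end-points. It uses the identity $f^T\Sigma^{-1}f = f(t_1)^2/t_1 + \sum_i (f(t_{i+1}) - f(t_i))^2/(t_{i+1} - t_i)$, which follows from Lemma A.1 with $u(t) = t$ and $v(t) = 1$.

```python
def discrete_blue_variance(f, a, b, n):
    """Variance 1 / (f^T Sigma^{-1} f) of the BLUE on n equally spaced points
    of [a, b] for the kernel min(t, t'), via the tridiagonal inverse."""
    t = [a + (b - a) * i / (n - 1) for i in range(n)]
    information = f(t[0]) ** 2 / t[0]
    for i in range(1, n):
        information += (f(t[i]) - f(t[i - 1])) ** 2 / (t[i] - t[i - 1])
    return 1.0 / information


def continuous_blue_variance(f, f_prime, a, b, cells=20000):
    """D* = [f(a)^2 / a + int_a^b f'(t)^2 dt]^{-1}, integral by the midpoint rule."""
    h = (b - a) / cells
    integral = sum(f_prime(a + (j + 0.5) * h) ** 2 for j in range(cells)) * h
    return 1.0 / (f(a) ** 2 / a + integral)


def f(t):
    return t * t  # hypothetical regression function


def f_prime(t):
    return 2.0 * t


d_star = continuous_blue_variance(f, f_prime, 1.0, 2.0)
d_5 = discrete_blue_variance(f, 1.0, 2.0, 5)
d_2000 = discrete_blue_variance(f, 1.0, 2.0, 2000)
```

For this example the exact value is $D^* = (1 + 28/3)^{-1} = 3/31$. The discrete variances are never below $D^*$ and approach it as the grid is refined.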
A.5 Proof of Theorem 2.3 

Let $\{\tilde\varepsilon(s) \mid s \in [\tilde a, \tilde b]\}$ be a Brownian motion on the interval $[\tilde a, \tilde b]$ and consider the regression 
model (2.1) with some function $\tilde f(s)$ and the error process $\tilde\varepsilon$. By Theorem 2.2 the optimal design 
is given by 

\[
\tilde\xi^*(ds) = \tilde P_{\tilde a}\,\delta_{\tilde a}(ds) + \tilde P_{\tilde b}\,\delta_{\tilde b}(ds) + \tilde p(s)\,ds
\]

with 

\[
\tilde P_{\tilde a} = c\,\frac{\tilde f(\tilde a) - \tilde f'(\tilde a)\,\tilde a}{\tilde a\,\tilde f(\tilde a)}, \qquad
\tilde P_{\tilde b} = c\,\frac{\tilde f'(\tilde b)}{\tilde f(\tilde b)} \qquad \text{and} \qquad
\tilde p(s) = -c\,\frac{\tilde f''(s)}{\tilde f(s)}.
\]

We shall now use Theorem B.1 to derive the optimal design $\xi^*(dt)$ for the original regression 
model (2.1) with regression function $f(t)$ and covariance kernel $K(t,t')$ from the design $\tilde\xi^*(ds)$ 
for the function $\tilde f(s) = h(q^{-1}(s))$, where $h(t) = f(t)/v(t)$. 

For the Brownian motion, the covariance function is defined by (B.4) with $\tilde v(t) = 1$ and $\tilde q(t) = t$ 
so that by (B.6) we have $\tilde\beta(t) = q(t)$, $\tilde\alpha(t) = v(t)$ and $\alpha(s) = 1/v(q^{-1}(s))$. According to (B.14) 
the optimal design $d\tilde\xi^*(s)$ transforms to $d\xi^*(t) = \alpha^2(\tilde\beta(t))\,d\tilde\xi^*(\tilde\beta(t)) = d\tilde\xi^*(q(t))/v^2(t)$. 

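Consider first the mass at $b$. We have $\tilde P_{\tilde b} = c\,\tilde f'(\tilde b)/\tilde f(\tilde b)$ with $\tilde b = q(b)$. By using the 
transformation of $s$ into $t = q^{-1}(s)$, we obtain 

\[
P_b = \frac{\tilde P_{\tilde b}}{v^2(b)} = c\,\frac{\tilde f'(\tilde b)}{\tilde f(\tilde b)\,v^2(b)}
= c\,\frac{h'(b)}{q'(b)\,v^2(b)\,h(b)} = c\,\frac{h'(b)}{q'(b)\,v(b)\,f(b)},
\]

as required. From the representation of $\tilde P_{\tilde a}$ with $\tilde a = q(a)$ we obtain by similar arguments 

\[
P_a = \frac{\tilde P_{\tilde a}}{v^2(a)} = c\,\frac{h(a) - q(a)\,h'(a)/q'(a)}{q(a)\,v^2(a)\,h(a)}.
\]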

Let us now consider the density $\tilde p(s)$, $s \in [\tilde a, \tilde b]$, and rewrite $d\tilde\xi^*_c(\tilde\beta(t))$, the absolutely continuous 
part of the measure $\tilde\xi^*$. The transformation of the variable $s$ into $t = q^{-1}(s) \in [a, b]$ induces 
the density 

\[
d\tilde\xi^*_c(\tilde\beta(t)) = \tilde p(q(t))\,q'(t) = -c\,q'(t)\,\frac{\tilde f''(q(t))}{\tilde f(q(t))}. \tag{A.16}
\]

Differentiating the equality $\tilde f(s) = h(q^{-1}(s))$, we have 

\[
\tilde f''(s) = \big(h'(q^{-1}(s)) \cdot (q^{-1}(s))'\big)' = h''(q^{-1}(s)) \cdot \big((q^{-1}(s))'\big)^2 + h'(q^{-1}(s)) \cdot (q^{-1}(s))''.
\]

Now we obtain 

\[
\tilde f''(q(t)) = \frac{h''(t)}{(q'(t))^2} - h'(t)\,\frac{q''(t)}{(q'(t))^3}.
\]

Inserting this into (A.16) and taking into account that $\tilde f(q(t)) = h(t)$, we obtain the density 

\[
d\tilde\xi^*_c(\tilde\beta(t)) = \frac{c}{h(t)\,q'(t)}\Big[h'(t)\,\frac{q''(t)}{q'(t)} - h''(t)\Big] = -\frac{c}{h(t)}\Big[\frac{h'(t)}{q'(t)}\Big]'. \tag{A.17}
\]

In view of the relation $d\xi^*(t) = \alpha^2(\tilde\beta(t))\,d\tilde\xi^*(\tilde\beta(t))$ we need to divide the right-hand side in 
(A.17) by $v^2(t)$ and obtain the expression for the density (2.11). This completes the proof of 
Theorem 2.3. 


B Gaussian processes with triangular covariance kernels 

B.1 Extended Doob's representation 

Assume that $\{\varepsilon(t) \mid t \in [a, b]\}$ is a Gaussian process with covariance kernel $K$ of the form (2.4); 
that is, $K(t,t') = u(t)v(t')$ for $t \le t'$, where $u(\cdot)$ and $v(\cdot)$ are functions defined on the interval 
$[a, b]$. According to the terminology introduced in Mehr and McFadden (1965), kernels of the 
form (2.4) are called triangular. An alternative way of writing these covariance kernels is 

\[
K(t,t') = v(t)\,v(t')\,\min\{q(t), q(t')\} \quad \text{for } t, t' \in [a,b], \tag{B.1}
\]

where $q(t) = u(t)/v(t)$. We assume that $\varepsilon(t)$ is non-degenerate on the open interval $(a, b)$, 
which implies that the function $q$ is strictly increasing and continuous on the interval $[a, b]$ [see 
Mehr and McFadden (1965), Remark 2]. Moreover, this function is also positive on the interval 
$(a, b)$ [see Remark 1 in Mehr and McFadden (1965)], which yields that the functions $u$ and $v$ 
must have the same sign and can be assumed to be positive on the interval $(a, b)$ without loss of 
generality. We repeatedly use the following extension of the celebrated Doob's representation 
[see Doob (1949)], which relates two Gaussian processes (on compact intervals) by a time-space 
transformation. 

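Lemma B.1 Let $\{\varepsilon(t) \mid t \in [a,b]\}$ be a non-degenerate Gaussian process with zero mean and 
covariance function (B.1) and let $\tilde v$ and $\tilde q$ be continuous positive functions on $[\tilde a, \tilde b]$, such that 
$\tilde q$ is strictly increasing and $\tilde q([\tilde a, \tilde b]) = q([a,b])$. Define the transformations $\beta: [\tilde a, \tilde b] \to [a, b]$ and 
$\alpha: [\tilde a, \tilde b] \to \mathbb{R}_+$ by 

\[
\beta(s) = q^{-1}(\tilde q(s)), \qquad \alpha(s) = \tilde v(s)/v(\beta(s)). \tag{B.2}
\]

Then the Gaussian process $\{\tilde\varepsilon(s) \mid s \in [\tilde a, \tilde b]\}$ defined by 

\[
\tilde\varepsilon(s) = \alpha(s)\,\varepsilon(\beta(s)) \tag{B.3}
\]

has zero mean and the covariance function is given by 

\[
\tilde K(s, s') = E[\tilde\varepsilon(s)\tilde\varepsilon(s')] = \tilde v(s)\,\tilde v(s')\,\min(\tilde q(s), \tilde q(s')). \tag{B.4}
\]

Conversely, the Gaussian process $\varepsilon(t)$ can be expressed via $\tilde\varepsilon(s)$ by the transformation 

\[
\varepsilon(t) = \tilde\alpha(t)\,\tilde\varepsilon(\tilde\beta(t)), \tag{B.5}
\]

where 

\[
\tilde\beta(t) = \tilde q^{-1}(q(t)), \qquad \tilde\alpha(t) = v(t)/\tilde v(\tilde\beta(t)). \tag{B.6}
\]

Proof. Since $\{\varepsilon(t) \mid t \in [a, b]\}$ is Gaussian and has zero mean, the process defined by (B.3) is 
also Gaussian and has zero mean. For the covariance function of the process (B.3) we have 

\[
E[\tilde\varepsilon(s)\tilde\varepsilon(s')] = \alpha(s)\alpha(s')\,E[\varepsilon(\beta(s))\varepsilon(\beta(s'))]
= \alpha(s)\alpha(s')\,v(\beta(s))\,v(\beta(s'))\,\min\{q(\beta(s)), q(\beta(s'))\}
\]
\[
= \tilde v(s)\,\tilde v(s')\,\min\{\tilde q(s), \tilde q(s')\} = \tilde K(s, s'),
\]

where the last line uses $\alpha(s)\,v(\beta(s)) = \tilde v(s)$ and $q(\beta(s)) = \tilde q(s)$, both of which follow from (B.2). 
The second part of the proof follows by the same arguments and the details are therefore 
omitted. □ 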


Remark B.1 

(a) The classical result of Doob is a particular case of (B.5) when $\tilde\varepsilon(s) = W(s)$ is the Brownian 
motion with covariance function $\tilde K(s, s') = \min(s, s')$. In this case we have $\tilde v(s) = 1$, 
$\tilde q(s) = s$, $\tilde\alpha(t) = v(t)$ and $\tilde\beta(t) = q(t)$. Specifically, Doob's representation is given by 
$\varepsilon(t) = v(t)\,W(q(t))$ [see Doob (1949)]. 

(b) Both functions $\beta: [\tilde a, \tilde b] \to [a, b]$ and $\tilde\beta: [a, b] \to [\tilde a, \tilde b]$ are strictly increasing 
functions and are inverses of each other; that is, 

\[
\tilde\beta(t) = \beta^{-1}(t), \quad \forall t \in [a, b]. \tag{B.7}
\]

(c) The functions $\alpha(\cdot)$ and $\tilde\alpha(\cdot)$ are positive and satisfy the relation 

\[
\tilde\alpha(t) \cdot \alpha(\tilde\beta(t)) = \frac{v(t)}{\tilde v(\tilde\beta(t))} \cdot \frac{\tilde v(\tilde\beta(t))}{v(\beta(\tilde\beta(t)))} = 1, \quad \forall t \in [a, b]. \tag{B.8}
\]

(d) The properties (b) and (c) imply that the transformation $\tilde\varepsilon \to \varepsilon$ defined by (B.5) is the 
inverse of the transformation $\varepsilon \to \tilde\varepsilon$ defined in (B.3). 
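The covariance identity in the proof of Lemma B.1 can be verified on a concrete kernel. The sketch below is an illustration under assumed choices that do not appear in the text: the exponential kernel $K(t,t') = e^{-|t-t'|}$ on $[0,1]$, which is triangular with $u(t) = e^{t}$, $v(t) = e^{-t}$ and $q(t) = e^{2t}$, and the Brownian-motion target $\tilde v(s) = 1$, $\tilde q(s) = s$ on $[\tilde a, \tilde b] = [q(0), q(1)]$.

```python
import math


def v(t):
    return math.exp(-t)  # hypothetical choice; u(t) = exp(t), so q(t) = exp(2t)


def q_inverse(s):
    return 0.5 * math.log(s)


def kernel(t, t_prime):
    """Triangular kernel u(min) * v(max) = exp(-|t - t'|)."""
    return math.exp(-abs(t - t_prime))


def beta(s):
    """beta(s) = q^{-1}(q~(s)) with the Brownian choice q~(s) = s."""
    return q_inverse(s)


def alpha(s):
    """alpha(s) = v~(s) / v(beta(s)) with the Brownian choice v~(s) = 1."""
    return 1.0 / v(beta(s))


def transformed_kernel(s, s_prime):
    """Covariance of alpha(s) * eps(beta(s)), computed from the original kernel."""
    return alpha(s) * alpha(s_prime) * kernel(beta(s), beta(s_prime))


# Grid on [q(0), q(1)] = [1, e^2]; (B.4) predicts the Brownian covariance min(s, s').
s_low, s_high = 1.0, math.exp(2.0)
grid = [s_low + (s_high - s_low) * k / 7 for k in range(8)]
max_error = max(
    abs(transformed_kernel(s, sp) - min(s, sp)) for s in grid for sp in grid
)
```

In this example the transformed process has the covariance $\min(s, s')$ of a Brownian motion on $[1, e^2]$, which corresponds to the classical Doob representation mentioned in Remark B.1 (a).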


B.2 Transformation of regression models 

Associated with the transformation of the triangular covariance kernels there exists a canonical 
transformation for the corresponding regression models. To be precise, consider the regression 
model (1.1) or its continuous time version (1.2), where the covariance kernel $K(\cdot,\cdot)$ has the 
form (B.1). Recall the definition of the transformation $\tilde\beta: [a, b] \to [\tilde a, \tilde b]$ defined in (B.6), 
which maps the observation points $t_j$ to $\tilde t_j = \tilde\beta(t_j)$, $j = 1, \ldots, N$, and define 

\[
\tilde f(s) = \frac{f(\beta(s))}{\tilde\alpha(\beta(s))}, \qquad
\tilde\varepsilon(s) = \frac{\varepsilon(\beta(s))}{\tilde\alpha(\beta(s))}, \qquad
\tilde y(\tilde t_j) = \frac{y(t_j)}{\tilde\alpha(t_j)}, \tag{B.9}
\]

where $s \in [\tilde a, \tilde b]$ so that $\beta(s) \in [a, b]$. The regression model (1.1) can now be rewritten in the 
form 

\[
\tilde y(\tilde t_j) = \theta^T \tilde f(\tilde t_j) + \tilde\varepsilon(\tilde t_j), \qquad j = 1, \ldots, N. \tag{B.10}
\]


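The errors $\tilde\varepsilon(\tilde t_j)$ in (B.10) have zero mean and, by Lemma B.1 and the identity (B.8), their 
covariances are given by 

\[
E[\tilde\varepsilon(\tilde t_i)\tilde\varepsilon(\tilde t_j)] = \tilde K(\tilde t_i, \tilde t_j). \tag{B.11}
\]

Hence we have transformed the regression observation scheme (1.1) with error covariances 
$E[\varepsilon(t_i)\varepsilon(t_j)] = K(t_i, t_j)$ to the scheme (B.10) with covariances (B.11). Conversely, we can 
transform the model (B.10) with covariances (B.11) to the model (1.1) using the transformations 

\[
f(t) = \frac{\tilde f(\tilde\beta(t))}{\alpha(\tilde\beta(t))}, \qquad
\varepsilon(t) = \frac{\tilde\varepsilon(\tilde\beta(t))}{\alpha(\tilde\beta(t))}. \tag{B.12}
\]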


Lemma B.2 The transformation $f \to \tilde f$ defined in (B.9) is an inverse to the transformation 
$\tilde f \to f$ defined in (B.12). 


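Proof. Inserting the expression for $\tilde f$ from (B.9) into (B.12), we have 

\[
\frac{\tilde f(\tilde\beta(t))}{\alpha(\tilde\beta(t))}
= \frac{f(\beta(\tilde\beta(t)))}{\tilde\alpha(\beta(\tilde\beta(t)))\,\alpha(\tilde\beta(t))}
= \frac{f(t)}{\tilde\alpha(t)\,\alpha(\tilde\beta(t))} = f(t),
\]

where we have used the identities $\beta(\tilde\beta(t)) = t$, see (B.7), and $\tilde\alpha(t)\,\alpha(\tilde\beta(t)) = 1$, see (B.8). 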


B.3 Transformation of designs 


In this section we consider a transformation of the matrix-weighted designs under a given 
transformation of the regression models. In the one-parameter case with $m = 1$, these 
matrix-weighted designs become signed measures; that is, signed designs as considered in Section 2. In 
this section, it is convenient to define all integrals as Lebesgue-Stieltjes integrals with respect 
to the distribution functions of the measures $\zeta$ and $\tilde\zeta$. 

To be precise, let $d\xi(t) = O_\xi(t)\,d\zeta(t)$ be a matrix-weighted design on the interval $t \in [a, b]$. 
Recalling the definition of $\alpha$, $\tilde\alpha$ and $\beta$, $\tilde\beta$ in (B.2) and (B.6) we define a matrix-weighted design 
$d\tilde\xi(s) = O_{\tilde\xi}(s)\,d\tilde\zeta(s)$ by 

\[
d\tilde\zeta(s) = d\zeta(\beta(s)) \quad \text{and} \quad O_{\tilde\xi}(s) = \tilde\alpha^2(\beta(s))\,O_\xi(\beta(s)). \tag{B.13}
\]

Note that $\zeta$ and $\tilde\zeta$ are probability measures on the intervals $[a, b]$ and $[\tilde a, \tilde b]$, respectively. 
Similarly, for a given matrix-weighted design $d\tilde\xi(s) = O_{\tilde\xi}(s)\,d\tilde\zeta(s)$ on the interval $[\tilde a, \tilde b]$ we define a 
matrix-weighted design $d\xi(t) = O_\xi(t)\,d\zeta(t)$ on the interval $[a, b]$ by 

\[
d\zeta(t) = d\tilde\zeta(\tilde\beta(t)) \quad \text{and} \quad O_\xi(t) = \alpha^2(\tilde\beta(t))\,O_{\tilde\xi}(\tilde\beta(t)). \tag{B.14}
\]

Similar to Lemma B.2 we can see that the transformation $\tilde\xi \to \xi$ defined by (B.14) is the inverse 
to the transformation $\xi \to \tilde\xi$ defined by (B.13). 

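For the following discussion we recall the definition of the covariance matrix $D(\xi)$ in (3.11). For 
the model (B.10), the covariance matrix of the design $d\tilde\xi(s) = O_{\tilde\xi}(s)\,d\tilde\zeta(s)$, defined by (B.13), 
is given by 

\[
\tilde D(\tilde\xi) = \tilde M^{-1}(\tilde\xi)\,\tilde B(\tilde\xi)\,(\tilde M^{-1}(\tilde\xi))^T, \tag{B.15}
\]

where 

\[
\tilde B(\tilde\xi) = \int\!\!\int \tilde K(t,s)\,O_{\tilde\xi}(t)\tilde f(t)\,\big(O_{\tilde\xi}(s)\tilde f(s)\big)^T d\tilde\zeta(t)\,d\tilde\zeta(s), \qquad
\tilde M(\tilde\xi) = \int O_{\tilde\xi}(t)\,\tilde f(t)\tilde f^T(t)\,d\tilde\zeta(t)
\]

and the kernel $\tilde K$ is defined by (B.4). 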


Theorem B.1 For any matrix-weighted design $d\xi(t) = O_\xi(t)\,d\zeta(t)$ and the corresponding 
matrix-weighted design $\tilde\xi$ defined by (B.13), we have $D(\xi) = \tilde D(\tilde\xi)$. In particular, $D^* = \tilde D^*$, where $D^*$ 
and $\tilde D^*$ are the covariance matrices of the BLUE in the continuous time model (1.2) and in 
the model $\{\theta^T \tilde f(s) + \tilde\varepsilon(s) \mid s \in [\tilde a, \tilde b]\}$, respectively. 


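Proof. Using the variable transformation $\beta(s) = t$ and (B.9), we have 

\[
\tilde M(\tilde\xi) = \int O_{\tilde\xi}(s)\,\tilde f(s)\tilde f^T(s)\,d\tilde\zeta(s)
= \int O_\xi(\beta(s))\,\frac{f(\beta(s))}{\tilde\alpha(\beta(s))}\,\frac{f^T(\beta(s))}{\tilde\alpha(\beta(s))}\,\tilde\alpha^2(\beta(s))\,d\zeta(\beta(s))
= \int O_\xi(t)\,f(t)f^T(t)\,d\zeta(t) = M(\xi).
\]

Next, we calculate the corresponding expression for $\tilde B(\tilde\xi)$, that is 

\[
\tilde B(\tilde\xi) = \int\!\!\int \tilde K(x,y)\,O_{\tilde\xi}(x)\tilde f(x)\,\big(O_{\tilde\xi}(y)\tilde f(y)\big)^T d\tilde\zeta(x)\,d\tilde\zeta(y)
\]
\[
= \int\!\!\int \tilde v(x)\tilde v(y)\,\min(\tilde q(x), \tilde q(y))\,O_{\tilde\xi}(x)\tilde f(x)\,\big(O_{\tilde\xi}(y)\tilde f(y)\big)^T d\tilde\zeta(x)\,d\tilde\zeta(y)
\]
\[
= \int\!\!\int \tilde v(x)\tilde v(y)\,\min(\tilde q(x), \tilde q(y))\,
\frac{O_\xi(\beta(x))f(\beta(x))}{\tilde\alpha(\beta(x))}\,
\frac{\big(O_\xi(\beta(y))f(\beta(y))\big)^T}{\tilde\alpha(\beta(y))}
\times \tilde\alpha^2(\beta(x))\,d\zeta(\beta(x))\,\tilde\alpha^2(\beta(y))\,d\zeta(\beta(y)).
\]

Define $s = \beta(x)$ and $t = \beta(y)$ so that $x = \beta^{-1}(s) = \tilde\beta(s)$ and similarly $y = \tilde\beta(t)$. Changing the 
variables in the integrals above we obtain 

\[
\tilde B(\tilde\xi) = \int\!\!\int \tilde v(\tilde\beta(s))\,\tilde v(\tilde\beta(t))\,\min\big(\tilde q(\tilde\beta(s)), \tilde q(\tilde\beta(t))\big)\,O_\xi(s)f(s)\,\big(O_\xi(t)f(t)\big)^T \tilde\alpha(s)\tilde\alpha(t)\,d\zeta(s)\,d\zeta(t).
\]

Using the definition of $\tilde\beta$ in (B.6) yields $\tilde q(\tilde\beta(t)) = \tilde q(\tilde q^{-1}(q(t))) = q(t)$ and by the definition of 
$\tilde\alpha$ in (B.6) we finally get 

\[
\tilde B(\tilde\xi) = \int\!\!\int \tilde v(\tilde\beta(s))\,\tilde v(\tilde\beta(t))\,\min(q(s), q(t))\,O_\xi(s)f(s)\,\big(O_\xi(t)f(t)\big)^T \frac{v(s)}{\tilde v(\tilde\beta(s))}\,\frac{v(t)}{\tilde v(\tilde\beta(t))}\,d\zeta(s)\,d\zeta(t)
\]
\[
= \int\!\!\int \min(q(s), q(t))\,O_\xi(s)f(s)\,\big(O_\xi(t)f(t)\big)^T v(s)v(t)\,d\zeta(s)\,d\zeta(t) = B(\xi).
\]

The result $\tilde D(\tilde\xi) = D(\xi)$ follows now from the definitions (3.11) and (B.15). 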
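The invariance stated in Theorem B.1 can be illustrated in the one-parameter case, where a design reduces to a signed measure with finitely many support points. The sketch below is a hypothetical example and not taken from the paper: it assumes the exponential kernel $e^{-|t-t'|}$ (so that $v(t) = e^{-t}$ and $q(t) = e^{2t}$), the Brownian-motion target $\tilde v = 1$, $\tilde q(s) = s$, the regression function $f(t) = 1 + t$ and arbitrarily chosen signed weights. With these choices $\tilde\beta(t) = q(t)$ and $\tilde\alpha(t) = v(t)$.

```python
import math


def criterion(points, weights, f_values, kernel):
    """One-parameter criterion B / M^2 for a discrete signed design."""
    n = len(points)
    b = sum(
        kernel(points[i], points[j])
        * weights[i] * f_values[i] * weights[j] * f_values[j]
        for i in range(n)
        for j in range(n)
    )
    m = sum(w * fv * fv for w, fv in zip(weights, f_values))
    return b / (m * m)


def exponential_kernel(t, t_prime):
    return math.exp(-abs(t - t_prime))


def brownian_kernel(s, s_prime):
    return min(s, s_prime)


# Original design (hypothetical points, signed weights and regression function).
points = [0.1, 0.4, 0.7, 1.0]
weights = [0.4, -0.1, 0.3, 0.4]
f_values = [1.0 + t for t in points]

# Transformed design following (B.9) and (B.13).
alpha_tilde = [math.exp(-t) for t in points]                  # alpha~(t_i) = v(t_i)
points_tilde = [math.exp(2.0 * t) for t in points]            # t~_i = beta~(t_i) = q(t_i)
weights_tilde = [a * a * w for a, w in zip(alpha_tilde, weights)]
f_values_tilde = [fv / a for fv, a in zip(f_values, alpha_tilde)]

d_original = criterion(points, weights, f_values, exponential_kernel)
d_transformed = criterion(points_tilde, weights_tilde, f_values_tilde, brownian_kernel)
```

The two values agree up to rounding, since the factors $\tilde\alpha(t_i)$ introduced by the transformed weights and regression function cancel exactly against the change of kernel, as in the proof above.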